IBM Books

IBM LoadLeveler for AIX 5L: Using and Administering

llstatus - Query machine status

Purpose

Returns status information about machines in the LoadLeveler cluster. It does not provide status on any NQS machine.

Syntax

llstatus [-?] [-H] [-R] [-F] [-v] [-l] [-f category_list] [-r category_list] [hostlist]

Flags

-?
Provides a short usage message.

-H
Provides extended help information.

-R
Lists all of the machine consumable resources associated with all of the machines in the LoadLeveler cluster (when specified alone). When a host list is specified, the option only displays machine consumable resources associated with the specified hosts. This option should not be used with any other option.

-F
Lists all of the floating consumable resources associated with the LoadLeveler cluster. This option should not be used with any other option.

-v
Outputs the name of the command, release number, service level, service level date, and operating system used to build the command.

-l
Specifies that a long listing be generated for each machine for which status is requested. If -l is not specified, the standard list, described below, is generated.

-f category_list
Is a blank-delimited list of categories you want to query. Each category you specify must be preceded by a percent sign. The category_list cannot contain duplicate entries. This flag allows you to create a customized version of the standard llstatus listing. The output fields produced by this flag all have a fixed length. The output is displayed in the order in which you specify the categories. category_list can be one or more of the following:
%a
Hardware architecture
%act
Number of job steps dispatched by the schedd daemon on this machine
%cm
Custom Metric value
%cpu
Number of CPUs on this machine
%d
Available disk space in the LoadLeveler execute directory
%i
Number of seconds since last keyboard or mouse activity
%inq
Number of job steps in the job queue of this schedd machine
%l
Berkeley one-minute load average
%m
Physical memory on this machine
%mt
Maximum number of initiators that can be used simultaneously on this machine
%n
Machine name
%o
Operating system on this machine
%r
Number of initiators used by the startd daemon on this machine
%sca
Availability of the schedd daemon
%scs
State of the schedd daemon
%sta
Availability of the startd daemon
%sts
State of the startd daemon
%v
Available swap space (free paging space) of this machine

-r category_list
Is a blank-delimited list of categories you want to query. Each category you specify must be preceded by a percent sign. The category_list cannot contain duplicate entries. This flag allows you to create a customized version of the standard llstatus listing. The output produced by this flag is considered raw, in that the fields can be variable in length. The output is displayed in the order in which you specify the categories. Output fields are separated by an exclamation point (!). category_list can be one or more of the categories listed under the -f flag.

hostlist
Is a blank-delimited list of machines for which status is requested.
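
The category_list rules given for the -f and -r flags (percent-sign prefix, no duplicates, output in the order specified) can be checked before the command is run. The following Python sketch is one way to do that. The function name build_llstatus_args and the KNOWN_CATEGORIES set are illustrative names, not part of LoadLeveler; the set simply copies the categories listed under the -f flag. Combining a category list with a hostlist is assumed to be allowed because the syntax line shows both.

```python
# Category codes copied from the -f flag description in this section.
KNOWN_CATEGORIES = {
    "%a", "%act", "%cm", "%cpu", "%d", "%i", "%inq", "%l", "%m", "%mt",
    "%n", "%o", "%r", "%sca", "%scs", "%sta", "%sts", "%v",
}


def build_llstatus_args(flag, categories, hosts=()):
    """Return an argument list for llstatus with a checked category_list.

    flag is "-f" (fixed-length fields) or "-r" (raw, "!"-separated fields).
    This helper is hypothetical; it only assembles and validates arguments.
    """
    if flag not in ("-f", "-r"):
        raise ValueError("flag must be -f or -r")
    if not categories:
        raise ValueError("category_list needs at least one category")
    seen = set()
    for cat in categories:
        if not cat.startswith("%"):
            raise ValueError(f"{cat!r} must be preceded by a percent sign")
        if cat not in KNOWN_CATEGORIES:
            raise ValueError(f"{cat!r} is not a documented category")
        if cat in seen:
            raise ValueError(f"duplicate category {cat!r}")
        seen.add(cat)
    # The caller's order is kept because output follows the order specified.
    return ["llstatus", flag, *categories, *hosts]


print(build_llstatus_args("-r", ["%n", "%sts", "%l"], ["silver", "gold"]))
# ['llstatus', '-r', '%n', '%sts', '%l', 'silver', 'gold']
```

Returning a list rather than a single string keeps each category a separate argument, so it can be passed directly to a process-launching call without shell quoting.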

Description

If no hostlist is specified, all machines are queried.

If you have more than a few machines configured for LoadLeveler, consider redirecting the output to a file when using the -l flag.

Each machine periodically updates the central manager with a snapshot of its situation. Since the information returned by using llstatus is a collection of such snapshots, all taken at varying times, the total picture may not be completely consistent.

If you define consumable resources in the administration file, then llstatus displays this information when either the -R or -l option is specified. For the predefined ConsumableCpus resource, the "total" values reported by llstatus can be the values defined in the administration file or the values evaluated by the startd daemons. The startd values are used if the administration file values are set to "all". In this case, llstatus appends a plus sign (+) to the resource name in the output reports.

Examples

This example requests a long status listing for machines named silver and gold:

llstatus -l silver gold

Results

The Standard Listing: The standard listing is generated when you do not specify the -l option with the llstatus command. The following is sample output from the llstatus command, where there are two nodes in the cluster.



+--------------------------------------------------------------------------------+
|Name                      Schedd  InQ Act Startd Run LdAvg Idle Arch      OpSys |
|k10n09.ppd.pok.ibm.com    Avail     3   1 Run      1 2.72     0 R6000     AIX51 |
|k10n12.ppd.pok.ibm.com    Avail     0   0 Idle     0 0.00   365 R6000     AIX51 |
|                                                                                |
|R6000/AIX51            2 machines   3 jobs   1 running                          |
|Total Machines         2 machines   3 jobs   1 running                          |
|                                                                                |
|The Central Manager is defined on k10n09.ppd.pok.ibm.com                        |
|                                                                                |
|The GANG scheduler is in use                                                    |
|                                                                                |
|All machines on the machine_list are present.                                   |
+--------------------------------------------------------------------------------+

The standard listing includes the following fields:

Name
Hostname of the machine.

Schedd
State of the schedd daemon, which can be one of the following:
Down
Drned (Drained)
Drning (Draining)
Avail (Available)

For a detailed explanation of these states, see The schedd daemon.

InQ
Number of job steps in the job queue of this schedd machine.

Act
Number of job steps dispatched by the schedd daemon on this machine.

Startd
State of the startd daemon, which can be:
Busy
Down
Drned (Drained)
Drning (Draining)
Flush
Idle
None
Run (Running)
Suspnd (Suspend)

For a detailed explanation of these states, see The startd daemon.

Run
The number of initiators used by the startd daemon to run LoadLeveler jobs on this machine. One initiator is used for each serial job step and one initiator is used for each task of a parallel job step.

LdAvg
Berkeley one-minute load average on this machine.

Idle
The number of seconds since keyboard or mouse activity in a login session was last detected. The highest number displayed is 9999.

Arch
The hardware architecture of the machine as listed in the configuration file.

OpSys
The operating system on this machine.

Total Machines
The standard listing includes the following summary fields:

machines
The number of machines in the cluster that have made a status report to the Central Manager.

jobs
The number of job steps in LoadLeveler job queues.

running
The number of initiators used by all the startd daemons in the LoadLeveler cluster. One initiator is used for each serial job step. One initiator is used for each task of a parallel job step.
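
The field definitions above suggest how the per-machine rows relate to the summary line. The following Python sketch reads the sample standard listing and recomputes the summary values. It assumes that columns are separated by white space, that no field contains blanks, and that a blank line ends the machine rows. Treating "jobs" as the sum of InQ and "running" as the sum of Run is an interpretation of the definitions above, and because the listing is built from snapshots taken at different times, the sums may not always match the reported totals. The function names are illustrative.

```python
SAMPLE = """\
Name                      Schedd  InQ Act Startd Run LdAvg Idle Arch      OpSys
k10n09.ppd.pok.ibm.com    Avail     3   1 Run      1 2.72     0 R6000     AIX51
k10n12.ppd.pok.ibm.com    Avail     0   0 Idle     0 0.00   365 R6000     AIX51

R6000/AIX51            2 machines   3 jobs   1 running
Total Machines         2 machines   3 jobs   1 running
"""


def parse_machine_rows(text):
    """Return one dict per machine row, keyed by the header names."""
    lines = text.splitlines()
    if not lines:
        return []
    header = lines[0].split()
    rows = []
    for line in lines[1:]:
        if not line.strip():
            break  # the first blank line ends the per-machine rows
        values = line.split()
        if len(values) == len(header):
            rows.append(dict(zip(header, values)))
    return rows


def summarize(rows):
    """Recompute the summary fields from the machine rows (assumed mapping)."""
    return {
        "machines": len(rows),
        "jobs": sum(int(r["InQ"]) for r in rows),
        "running": sum(int(r["Run"]) for r in rows),
    }


rows = parse_machine_rows(SAMPLE)
print(summarize(rows))  # {'machines': 2, 'jobs': 3, 'running': 1}
```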

Consumable Resources Listing: The llstatus command, issued with the -R option, generates a listing of all of the consumable resources associated with all of the machines in the LoadLeveler cluster. When a host list is specified, this option will only display resources associated with the specified hosts. The following is sample output from this command:

   llstatus -R

Figure 23. Sample llstatus -R command output


+--------------------------------------------------------------------------------+
|                                                                                |
|Machine                        Consumable Resource(Available, Total)            |
|------------------------------ -------------------------------------------------|
|c209f1n01.ppd.pok.ibm.com      ConsumableCpus(4,4)+ ConsumableMemory(1.000 gb,1.|
|c209f1n02.ppd.pok.ibm.com      ConsumableCpus(4,4)+ n02_res(123,500) Frame5(10,1|
|c209f1n05.ppd.pok.ibm.com      ConsumableCpus(4,4)+ ConsumableMemory(1.000 gb,1.|
|                                                                                |
|Resources with "+" appended to their names have the Total value reported from Startd|
+--------------------------------------------------------------------------------+
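
Each entry in this listing has the form resource_name(available,total), with a trailing plus sign when the total was reported by the startd daemon. The following Python sketch extracts those entries from one line of text. The regular expression and the function name parse_resources are illustrative. Values are kept as strings because memory values carry a unit such as "gb", and entries that are cut off at the edge of the display (as in the sample above) are skipped rather than guessed.

```python
import re

# name(available,total) followed by an optional "+" marker.
RESOURCE_RE = re.compile(r"(\w+)\(([^,()]+),([^()]+)\)(\+?)")


def parse_resources(text):
    """Map each complete resource entry to its available and total values."""
    resources = {}
    for name, available, total, plus in RESOURCE_RE.findall(text):
        resources[name] = {
            "available": available.strip(),
            "total": total.strip(),
            # "+" means the total value was reported by the startd daemon.
            "total_from_startd": plus == "+",
        }
    return resources


entries = parse_resources("ConsumableCpus(4,4)+ n02_res(123,500)")
print(entries["ConsumableCpus"]["total_from_startd"])  # True
print(entries["n02_res"]["available"])                 # 123
```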

Floating Consumable Resources Listing: The llstatus command, issued with the -F option, generates a listing of all of the floating consumable resources associated with the LoadLeveler cluster. This option should not be specified with any other option. The following is sample output from this command:

   llstatus -F
+--------------------------------------------------------------------------------+
|                                                                                |
|Floating Resource              Available     Total                              |
|------------------------------ ------------- ---------------                    |
|EDA_licenses                   20            29                                 |
|Frame5                         15            20                                 |
|WorkBench6                     5             7                                  |
|XYZ_software                   6             6                                  |
|                                                                                |
|                                                                                |
+--------------------------------------------------------------------------------+
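
The Available and Total columns can be used to work out how many units of each floating resource are currently in use. The following Python sketch does this for the sample above. It assumes that resource names contain no blanks and that both values are whole numbers; the subtraction (total minus available) is a derived figure, not something llstatus reports. The function name is illustrative.

```python
SAMPLE = """\
Floating Resource              Available     Total
------------------------------ ------------- ---------------
EDA_licenses                   20            29
Frame5                         15            20
WorkBench6                     5             7
XYZ_software                   6             6
"""


def parse_floating_resources(text):
    """Return {name: {"available": int, "total": int}} for each data row."""
    resources = {}
    for line in text.splitlines():
        parts = line.split()
        # Data rows have exactly three fields, the last two numeric;
        # the header and the dashed rule do not match and are skipped.
        if len(parts) == 3 and parts[1].isdigit() and parts[2].isdigit():
            name, available, total = parts
            resources[name] = {"available": int(available), "total": int(total)}
    return resources


for name, r in parse_floating_resources(SAMPLE).items():
    print(name, "in use:", r["total"] - r["available"])
```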

Customized, Formatted Standard Listing: A customized and formatted standard listing is generated when you specify llstatus with the -f option. The following is sample output from this command:

   llstatus -f %n %scs %inq %m %v %sts %l %o
+--------------------------------------------------------------------------------+
|Name             Schedd  InQ    Memory      FreeVMemory Startd  LdAvg  OpSys    |
|ll5.pok.ibm.com  Avail   0      128         22708       Run     0.23   AIX51    |
|ll6.pok.ibm.com  Avail   3      224         16732       Run     0.51   AIX51    |
|                                                                                |
|R6000/AIX51                 2 machines      3  jobs      3  running             |
|Total Machines              2 machines      3  jobs      3  running             |
|                                                                                |
|The Central Manager is defined on ll5.pok.ibm.com                               |
|                                                                                |
|The GANG scheduler is in use                                                    |
|                                                                                |
|All machines on the machine_list are present.                                   |
+--------------------------------------------------------------------------------+

Customized, Unformatted Standard Listing: A customized and unformatted (raw) standard listing is generated when you specify llstatus with the -r flag. Output fields are separated by an exclamation point (!). The following is sample output from this command:

   llstatus -r %n %scs %inq %m %v %sts %l %o
+--------------------------------------------------------------------------------+
|ll5.pok.ibm.com!Avail!0!128!22688!Running!0.14!AIX51                            |
|ll6.pok.ibm.com!Avail!3!224!16668!Running!0.37!AIX51                            |
+--------------------------------------------------------------------------------+
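
Because raw fields are separated by an exclamation point and appear in the order the categories were specified, the output is straightforward to split in a script. The following Python sketch parses the two sample lines above. The function name parse_raw is illustrative, and the sketch assumes that no field value itself contains an exclamation point.

```python
CATEGORIES = ["%n", "%scs", "%inq", "%m", "%v", "%sts", "%l", "%o"]

SAMPLE = (
    "ll5.pok.ibm.com!Avail!0!128!22688!Running!0.14!AIX51\n"
    "ll6.pok.ibm.com!Avail!3!224!16668!Running!0.37!AIX51\n"
)


def parse_raw(output, categories):
    """Pair each "!"-separated field with the category that produced it."""
    rows = []
    for line in output.splitlines():
        if not line.strip():
            continue
        fields = line.split("!")
        if len(fields) != len(categories):
            raise ValueError(f"expected {len(categories)} fields: {line!r}")
        rows.append(dict(zip(categories, fields)))
    return rows


rows = parse_raw(SAMPLE, CATEGORIES)
print(rows[1]["%n"], rows[1]["%inq"])  # ll6.pok.ibm.com 3
```

On a machine where LoadLeveler is installed, the text passed to parse_raw would be the captured standard output of the llstatus -r command run with the same category list.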

The Long Listing: The long listing is generated when you specify the -l option with the llstatus command. Following the sample output is an explanation of all possible fields displayed by the llstatus command.

The following is sample output from the llstatus -l c209f1n05 command:

Figure 24. Sample output from llstatus -l c209f1n05


+--------------------------------------------------------------------------------+
|=============================================================================== |
|Name                = c209f1n05.ppd.pok.ibm.com                                 |
|Machine             = c209f1n05.ppd.pok.ibm.com                                 |
|Arch                = R6000                                                     |
|OpSys               = AIX51                                                     |
|SYSPRIO             = (0 -  QDate)                                              |
|MACHPRIO            = ((Memory +  FreeRealMemory) -  ((LoadAvg *  1000) +  Custo|
|VirtualMemory       = 491560 kb                                                 |
|Disk                = 519484 kb                                                 |
|KeyboardIdle        = 0                                                         |
|Tmp                 = 519484 kb                                                 |
|LoadAvg             = 1.802475                                                  |
|ConfiguredClasses   = Parallel(12) 85ba(2) misc(2) tiny(1) No_Class(7) small(14)|
|AvailableClasses    = Parallel(8) 85ba(2) misc(2) tiny(1) No_Class(7) small(14) |
|DrainingClasses     =                                                           |
|DrainedClasses      =                                                           |
|Pool                = 1 7                                                       |
|FabricConnectivity  = 1                                                         |
|Adapter             = en0(ethernet,c209f1n05.ppd.pok.ibm.com,9.114.99.66,)      |
|                      css0(switch,c209f1sn05.ppd.pok.ibm.com,9.114.99.130,,1,8/1|
|                      csss(striped,,,,1,8/16,120M/128M,1,READY)                 |
|Feature             = OSL ESSL                                                  |
|Max_Starters        = 50                                                        |
|Memory              = 1024 mb                                                   |
|FreeRealMemory      = 484 mb                                                    |
|PagesFreed          = 0                                                         |
|PagesScanned        = 0                                                         |
|PagesPagedIn        = 0                                                         |
|PagesPagedOut       = 0                                                         |
|ConsumableResources = ConsumableCpus(0,4)+ ConsumableMemory(724.000 mb,1.000 gb)|
|ConfigTimeStamp     = Fri Jul 27 11:44:29 EDT 2001                              |
|                                                                                |
|Cpus                = 4                                                         |
|Speed               = 1.000000                                                  |
|Subnet              = 9.114.99                                                  |
|MasterMachPriority  = 0.000000                                                  |
|CustomMetric        = 1                                                         |
|StartdAvail         = 1                                                         |
|State               = Running                                                   |
|EnteredCurrentState = Fri Jul 27 11:46:22 EDT 2001                              |
|START               = ((LoadAvg <  5.000000) &&  ((tm_hour >  8) &&  (tm_hour < |
|SUSPEND             = F                                                         |
|CONTINUE            = T                                                         |
|VACATE              = F                                                         |
|KILL                = F                                                         |
|Machine Mode        = general                                                   |
|Running             = 5                                                         |
|ScheddAvail         = 1                                                         |
|ScheddState         = Avail                                                     |
|ScheddRunning       = 2                                                         |
|Pending             = 0                                                         |
|Starting            = 0                                                         |
|Idle                = 5                                                         |
|Unexpanded          = 0                                                         |
|Held                = 0                                                         |
|Removed             = 0                                                         |
|RemovedPending      = 0                                                         |
|Completed           = 0                                                         |
|TotalJobs           = 7                                                         |
|Running steps       = c209f1n05.ppd.pok.ibm.com.5.0 c209f1n05.ppd.pok.ibm.com.6.|
|                      c209f1n05.ppd.pok.ibm.com.7.0                             |
|TimeStamp           = Fri Jul 27 11:46:22 EDT 2001                              |
+--------------------------------------------------------------------------------+

The long listing includes these fields:

Adapter
Network adapter information associated with this machine.

Arch
Hardware architecture of this machine.

AvailableClasses
List of available classes and the associated number of available initiators on this machine.

Completed
The number of job steps in this state on this schedd machine.

ConfigTimeStamp
Date and time of last configuration or reconfiguration.

ConfiguredClasses
List of configured classes and the associated number of configured initiators on this machine.

ConsumableResources
List of consumable resources associated with this machine. Each element of this list has the format: resource_name(available, total).

CONTINUE
The expression, defined following C conventions in the configuration file, that evaluates to true or false (T/F). This determines whether suspended jobs are continued on this machine.

Cpus
Number of CPUs on this machine.

CustomMetric
This value can be the number assigned to the CUSTOM_METRIC keyword or the exit code of the executable associated with the CUSTOM_METRIC_COMMAND keyword or the default value of 1.

Disk
Available space, in kilobytes (less 512 KB), in LoadLeveler's execute directory on this machine.

DrainedClasses
List of classes which have been drained. If a job step is in a class named on this list, that job step will not start on this machine.

DrainingClasses
List of classes which are currently being drained on this machine. If a job step is in a class named on this list, that job step will not start on this machine.

EnteredCurrentState
Date and time when machine state was set.

FabricConnectivity
A Boolean vector representing the current state of connectivity between the machine's switch adapters and the SP switch.

Feature
Set of all features on this machine.

FreeRealMemory
Free real memory, in megabytes, on this machine. This value should track closely with the "fre" value of the vmstat command and the "free" value of the svmon -G command whose units are 4KB blocks.

Held
The number of job steps in this state on this schedd machine.

Idle
The number of job steps in this state on this schedd machine.

KeyboardIdle
Number of seconds since last keyboard or mouse activity.

KILL
The expression, defined following C conventions in the configuration file, that evaluates to true or false (T/F). This determines whether jobs running on this machine should be sent the SIGKILL signal.

LoadAvg
Berkeley one-minute load average on this machine.

Machine
Fully qualified name of the machine.

Machine Mode
The type of job this machine can run. This can be: batch, interactive, or general.

MACHPRIO
Actual expression that determines machine priority, defined in the configuration file.

MasterMachPriority
The machine priority for the parallel master node.

Max_Starters
Maximum number of initiators that can be used simultaneously on this machine.

Memory
Physical memory, in megabytes, on this machine.

Name
Hostname of the machine.

OpSys
Operating system on this machine.

PagesFreed
Pages freed per second. This value corresponds to the "fr" value of the vmstat command output.

PagesPagedIn
Pages paged in from paging space per second. This value corresponds to the "pi" value of the vmstat command output.

PagesPagedOut
Pages paged out to paging space per second. This value corresponds to the "po" value of the vmstat command output.

PagesScanned
Pages scanned by the page-replacement algorithm per second. This value corresponds to the "sr" value of the vmstat command output.

Pending
The number of job steps in this state on this schedd machine.

Pool
The identifier of the pool where this startd machine is located.

Removed
The number of job steps in this state on this schedd machine.

RemovedPending
The number of job steps in this state on this schedd machine.

Running
The number of initiators used by the startd daemon to run LoadLeveler jobs. One initiator is used for each serial job step. One initiator is used for each task of a parallel job step.

Running steps
The list of job steps currently running on this machine.

ScheddAvail
Flag indicating if machine is running a schedd daemon (0=no, 1=yes).

ScheddRunning
The number of job steps submitted to this machine that are running somewhere in the LoadLeveler cluster.

ScheddState
The state of the schedd daemon on this machine.

Speed
Speed associated with the machine.

START
The expression, defined following C conventions in the configuration file, that evaluates to true or false (T/F). This determines whether jobs can be started on this machine.

StartdAvail
Flag indicating if machine is running a startd daemon (0=no, 1=yes).

Starting
The number of job steps in this state on this schedd machine.

State
State of the startd daemon, which can be:
Busy
Down
Drained
Draining
Flush
Idle
None
Running
Suspend

For a detailed explanation of these states, see The startd daemon.

Subnet
The TCP/IP subnet that this machine resides on.

SUSPEND
The expression, defined following C conventions in the configuration file, that evaluates to true or false (T/F). This determines whether running jobs should be suspended on this machine.

SYSPRIO
Actual expression that determines overall system priority of a job step. Defined in the configuration file.

TimeStamp
The date and time the central manager last received a status update from this schedd machine.

Tmp
Available space, in kilobytes (less 512 KB), in the /tmp directory on this machine.

TotalJobs
The number of total job steps submitted to this schedd machine.

Unexpanded
The number of job steps in this state on this schedd machine.

VACATE
The expression, defined following C conventions in the configuration file, that evaluates to true or false (T/F). This determines whether suspended jobs are vacated on this machine.

VirtualMemory
Available swap space (free paging space) in kilobytes, on this machine.
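
The "field = value" layout of the long listing can be read into a dictionary per machine. The following Python sketch shows one way to do this on a shortened excerpt of the sample output. It assumes that each machine's block begins with a line of equal signs, that field names start in the first column, and that indented lines continue the previous field (as with Adapter and Running steps in the sample). The function name parse_long_listing is illustrative.

```python
SAMPLE = """\
===============================================================================
Name                = c209f1n05.ppd.pok.ibm.com
Arch                = R6000
DrainingClasses     =
Adapter             = en0(ethernet,c209f1n05.ppd.pok.ibm.com,9.114.99.66,)
                      csss(striped,,,,1,8/16,120M/128M,1,READY)
Machine Mode        = general
KILL                = F
"""


def parse_long_listing(text):
    """Return a list with one {field: value} dict per machine block."""
    machines = []
    current = None
    last_key = None
    for line in text.splitlines():
        if line.startswith("="):
            # Separator line: assumed to start a new machine block.
            current = {}
            machines.append(current)
            last_key = None
            continue
        if not line.strip():
            continue
        if current is None:
            current = {}
            machines.append(current)
        if line[0].isspace():
            # Indented line: continuation of the previous field's value.
            if last_key is not None:
                current[last_key] += " " + line.strip()
            continue
        # Split on the first "=" only, so expressions in values stay intact.
        key, sep, value = line.partition("=")
        if sep:
            last_key = key.strip()
            current[last_key] = value.strip()
    return machines


machine = parse_long_listing(SAMPLE)[0]
print(machine["Machine Mode"])  # general
```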

