IBM Books

IBM LoadLeveler for AIX 5L: Using and Administering

Step 2: Define LoadLeveler cluster characteristics

Use the following keywords to define the characteristics of the LoadLeveler cluster:

CUSTOM_METRIC = number
Specifies a machine's relative priority to run jobs. This is an arbitrary number which you can use in the MACHPRIO expression. Negative values are not allowed. If you specify neither CUSTOM_METRIC nor CUSTOM_METRIC_COMMAND, CUSTOM_METRIC = 1 is assumed. For more information, see Step 7: Prioritize the order of executing machines maintained by the negotiator.

CUSTOM_METRIC_COMMAND = command
Specifies an executable and any required arguments. The exit code of this command is assigned to CUSTOM_METRIC. If the command does not exit normally, CUSTOM_METRIC is assigned a value of 1. LoadLeveler forks this command once every (POLLING_FREQUENCY * POLLS_PER_UPDATE) period.
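
The exit-code rule above can be illustrated with a minimal Python sketch. This is not LoadLeveler code: the function names are hypothetical, and treating a command that cannot be started, or that is killed by a signal, as "did not exit normally" is an assumption made for the example.

```python
import subprocess
import sys

# Value the text says is assigned when the command does not exit normally.
DEFAULT_METRIC = 1


def custom_metric_from_command(argv):
    """Run argv and return its exit code, or 1 if it did not exit normally."""
    try:
        result = subprocess.run(
            argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
    except OSError:
        # Assumption: a command that cannot be started counts as an abnormal exit.
        return DEFAULT_METRIC
    if result.returncode < 0:
        # On POSIX, a negative return code means the child was killed by a signal.
        return DEFAULT_METRIC
    return result.returncode


def update_period(polling_frequency, polls_per_update):
    """Period between runs of the command, as described in the text."""
    return polling_frequency * polls_per_update


if __name__ == "__main__":
    # Hypothetical metric command: a child Python process that exits with code 3.
    print(custom_metric_from_command([sys.executable, "-c", "raise SystemExit(3)"]))
```

Because the metric is carried in an exit code, a command written for this keyword can only report a small non-negative integer.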

MACHINE_AUTHENTICATE = true | false
Specifies whether machine validation is performed. When set to true, LoadLeveler only accepts connections from machines specified in the administration file. When set to false, LoadLeveler accepts connections from any machine.

When set to true, every communication between LoadLeveler processes will verify that the sending process is running on a machine which is identified via a machine stanza in the administration file. The validation is done by capturing the address of the sending machine when the accept function call is issued to accept a connection. The gethostbyaddr function is called to translate the address to a name, and the name is matched with the list derived from the administration file.

Note: MACHINE_AUTHENTICATE must be set to true for Gang scheduling to work. For more information, see Restrictions for Gang scheduling and preemption.
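
The validation described above (translate the peer address to a name, then match the name against the machines in the administration file) might look roughly like the following Python sketch. The function name, the case-insensitive comparison, the matching of aliases, and the rejection of addresses that do not resolve are all assumptions for illustration; the text does not specify these details.

```python
import socket


def is_known_machine(peer_address, known_machines, resolve=socket.gethostbyaddr):
    """Return True if peer_address reverse-resolves to a name in known_machines.

    known_machines stands in for the machine names derived from the
    administration file. resolve is injectable so the check can be tested
    without a real name service.
    """
    try:
        hostname, aliases, _addresses = resolve(peer_address)
    except OSError:
        # Assumption: an address with no reverse mapping is rejected.
        return False
    known = {name.lower() for name in known_machines}
    # Assumption: either the primary name or any alias may match.
    return any(name.lower() in known for name in [hostname, *aliases])
```

In a server loop, `peer_address` would be the address returned by `accept` for the incoming connection.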

SCHEDULER_TYPE and SCHEDULER_API
The last cluster characteristic that needs to be defined is the LoadLeveler scheduler. Two keywords are available for setting this configuration and each has multiple options. SCHEDULER_TYPE is the preferred keyword but SCHEDULER_API is still available for migration purposes. For more information, see Choosing a scheduler.

Choosing a scheduler

This section discusses the types of schedulers available and the keywords (SCHEDULER_TYPE and SCHEDULER_API) used to define which scheduler LoadLeveler will use.

Scheduler keyword definitions

Use the following keywords to define your scheduler:

SCHEDULER_TYPE = LL_DEFAULT | BACKFILL | API | GANG
This keyword sets the scheduler used by LoadLeveler. When SCHEDULER_TYPE is defined, the obsolete keyword SCHEDULER_API is ignored.

Notes:

  1. If a scheduler type is not set, LoadLeveler will start and use the default scheduler.

  2. If you have set SCHEDULER_TYPE with an option that is not valid, LoadLeveler will not start.

  3. If you change the scheduler option specified by SCHEDULER_TYPE, you must either stop and restart LoadLeveler or recycle it, using llctl in both cases.

The SCHEDULER_TYPE definitions are:

LL_DEFAULT
Specifies the default LoadLeveler scheduling algorithm. If SCHEDULER_TYPE has not been defined, LoadLeveler will use the default scheduler (LL_DEFAULT).

BACKFILL
Specifies the LoadLeveler Backfill scheduler. When you specify this keyword, you should use only the default settings for the START expression and the other job control expressions described in Step 8: Manage a job's status using control expressions.

API
Specifies that you will use an external scheduler. External schedulers communicate with LoadLeveler through the job control API. For more information on setting an external scheduler, see Workload Management API.

GANG
Specifies that you will use the LoadLeveler Gang scheduling algorithm. For more information, see Using Gang scheduling.

SCHEDULER_API = YES | NO
The SCHEDULER_API keyword sets an external scheduler but it is now obsolete and should only be used for migration purposes. Use SCHEDULER_TYPE=API as a replacement for SCHEDULER_API=YES. If SCHEDULER_API has been set to YES and SCHEDULER_TYPE has not been defined, then SCHEDULER_API=YES is functionally equivalent to SCHEDULER_TYPE=API; LoadLeveler will ignore all other instances of SCHEDULER_API. For more information on setting an external scheduler, see Workload Management API.
Note: If you change the scheduler from a specified SCHEDULER_TYPE to SCHEDULER_API=YES, you must stop and restart LoadLeveler using llctl.
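
The precedence between the two keywords can be summarized in a short Python sketch. It is an illustration of the rules stated above, not LoadLeveler's implementation: the function name is hypothetical, the configuration is modeled as a plain dictionary, and exact uppercase matching of the option names is an assumption.

```python
VALID_SCHEDULER_TYPES = {"LL_DEFAULT", "BACKFILL", "API", "GANG"}


def effective_scheduler(config):
    """Return the scheduler selected by a dict of configuration keywords."""
    scheduler_type = config.get("SCHEDULER_TYPE")
    if scheduler_type is not None:
        # When SCHEDULER_TYPE is defined, SCHEDULER_API is ignored.
        if scheduler_type not in VALID_SCHEDULER_TYPES:
            # Models "LoadLeveler will not start" for an option that is not valid.
            raise ValueError(f"invalid SCHEDULER_TYPE: {scheduler_type!r}")
        return scheduler_type
    if config.get("SCHEDULER_API") == "YES":
        # Migration path: equivalent to SCHEDULER_TYPE=API.
        return "API"
    # Neither keyword selects a scheduler: the default scheduler is used.
    return "LL_DEFAULT"
```

For example, a configuration that sets both SCHEDULER_TYPE = BACKFILL and SCHEDULER_API = YES resolves to the Backfill scheduler in this sketch.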

SCHEDULER_TYPE option details

Setting up file system monitoring

You can use the file system keywords to monitor the file system space used by LoadLeveler for:

You can also use the file system keywords to take preventive action and avoid problems caused by running out of file system space. To do this, set how often LoadLeveler checks the file system free space, and set the lower and upper thresholds that trigger system responses to the amount of free space available. Setting a realistic span between the lower and upper thresholds avoids excessive system actions.

FS_INTERVAL = seconds
Defines the interval (in seconds) between checks of the file system size. If your file system receives many log messages, or if large executables are copied to the LoadLeveler spool, the file system fills up more quickly, and you should check it more frequently by setting the interval to a smaller value. LoadLeveler will not check the file system if the value of FS_INTERVAL is:
Note: If FS_INTERVAL is not specified but any of the other three keywords (FS_NOTIFY, FS_SUSPEND, or FS_TERMINATE) is specified, the FS_INTERVAL value defaults to 5 and the file system is checked.

FS_NOTIFY = lower threshold, upper threshold

This configuration file keyword defines when LoadLeveler notifies the administrator that there is a file system problem or that a file system problem has been resolved.

If the free space associated with the LoadLeveler file system drops below the lower threshold, LoadLeveler sends a mail message to the administrator indicating that logging problems may occur. When file system free space rises above the upper threshold (after passing the lower threshold), LoadLeveler sends a mail message to the administrator indicating that the problem has been resolved.

Default value (in blocks): 1000, -1

Valid values for both the lower and upper thresholds are -1 and any positive integer. If the value is set to -1, the transition across the threshold is not checked.
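
The two-threshold behavior described for FS_NOTIFY is a form of hysteresis: a "problem" event fires when free space drops below the lower threshold, and a "resolved" event fires only after free space later rises above the upper threshold. The Python sketch below illustrates that logic. The class and method names are hypothetical, and the behavior when the upper threshold is -1 (the watcher stays latched and never reports "resolved") is an assumption, since the text does not describe it.

```python
DISABLED = -1  # a threshold of -1 means the transition is not checked


class FreeSpaceWatcher:
    """Track transitions of free space (in blocks) across two thresholds."""

    def __init__(self, lower, upper):
        self.lower = lower
        self.upper = upper
        self.below_lower = False  # True after free space has passed the lower threshold

    def check(self, free_blocks):
        """Return 'problem', 'resolved', or None for one free-space sample."""
        if not self.below_lower:
            if self.lower != DISABLED and free_blocks < self.lower:
                self.below_lower = True
                return "problem"
        elif self.upper != DISABLED and free_blocks > self.upper:
            self.below_lower = False
            return "resolved"
        return None
```

With thresholds of 1000 and 2000 blocks, a sample of 1500 taken after a drop to 900 produces no event, which is how a wide span between the thresholds avoids repeated notifications.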

FS_SUSPEND = lower threshold, upper threshold

This configuration file keyword defines when LoadLeveler drains and resumes the schedd and startd daemons running on a node.

If the free space associated with the LoadLeveler file system drops below the lower threshold, LoadLeveler drains the schedd and the startd daemons if they are running on a node. When this happens, logging is turned off and mail notification is sent to the administrator.

When file system free space rises above the upper threshold (after passing the lower threshold), LoadLeveler signals the schedd and the startd daemons to resume. When this happens, logging is turned on and mail notification is sent to the administrator.

Default value (in blocks): -1, -1

Valid values for both the lower and upper thresholds are -1 and any positive integer. If the value is set to -1, the transition across the threshold is not checked.

FS_TERMINATE = lower threshold, upper threshold

This configuration file keyword defines when LoadLeveler sends the SIGTERM signal to the Master daemon, which then terminates all LoadLeveler daemons running on the node.

If the free space associated with the LoadLeveler file system drops below the lower threshold, all LoadLeveler daemons are terminated.

Note: Although the upper threshold setting for FS_TERMINATE is ignored when LoadLeveler is terminated, the upper threshold is still required on the statement.

Default value (in blocks): -1, -1

Valid values for the lower threshold are -1 and any positive integer. If the value is set to -1, the transition across the threshold is not checked.
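
The value rules shared by FS_NOTIFY, FS_SUSPEND, and FS_TERMINATE (two comma-separated thresholds, each either -1 or a positive integer, with both required on the statement) can be checked with a small Python sketch such as the one below. The function name is hypothetical, and how LoadLeveler itself reports a malformed value is not stated in the text; raising ValueError is simply this example's choice.

```python
def parse_thresholds(value):
    """Parse 'lower threshold, upper threshold' into a pair of integers."""
    parts = [part.strip() for part in value.split(",")]
    if len(parts) != 2:
        # Both thresholds are required, even for FS_TERMINATE.
        raise ValueError("expected 'lower threshold, upper threshold'")
    lower, upper = (int(part) for part in parts)  # int() rejects non-numeric text
    for name, number in (("lower", lower), ("upper", upper)):
        if number != -1 and number <= 0:
            raise ValueError(f"{name} threshold must be -1 or a positive integer")
    return lower, upper
```

For instance, the documented FS_NOTIFY default of `1000, -1` parses to the pair `(1000, -1)` in this sketch.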

