The output of llq -l includes the allocated host names. Another way to obtain the allocated host names is through the LOADL_PROCESSOR_LIST environment variable, which you can use from a shell script in your job command file as shown in Figure 15.
This example uses LOADL_PROCESSOR_LIST to perform a remote copy of a local file to all of the nodes, and then invokes POE. Note that the processor list contains an entry for each task running on a node. If two tasks are running on a node, LOADL_PROCESSOR_LIST will contain two instances of the host name where the tasks are running. The example in Figure 15 removes any duplicate entries.
Note that LOADL_PROCESSOR_LIST is set by LoadLeveler, not by the user. This environment variable is limited to 128 host names. If the list would contain more than 128 host names, the environment variable is not set.
Figure 15. Using LOADL_PROCESSOR_LIST in a shell script
#!/bin/ksh
# @ output = my_POE_program.$(cluster).$(process).out
# @ error = my_POE_program.$(cluster).$(process).err
# @ class = POE
# @ job_type = parallel
# @ node = 8,12
# @ network.MPI = css0,shared,US
# @ queue
tmp_file="/tmp/node_list"
rm -f $tmp_file
# Copy each entry in the list to a new line in a file so
# that duplicate entries can be removed.
for node in $LOADL_PROCESSOR_LIST
do
echo $node >> $tmp_file
done
# Sort the file removing duplicate entries and save list in variable
nodelist=$(sort -u $tmp_file)
for node in $nodelist
do
rcp localfile $node:/home/userid
done
rm -f $tmp_file
/usr/bin/poe /home/userid/my_POE_program
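The duplicate removal and the 128 host name limit described above can also be handled without a temporary file. The following is a minimal sketch, not part of the LoadLeveler documentation: the function name unique_hosts is hypothetical, and the echo stands in for the rcp command used in Figure 15.

```shell
#!/bin/sh
# Hypothetical helper: print each distinct host name from a
# whitespace-separated list, one per line, in sorted order.
unique_hosts() {
    # $1 is left unquoted on purpose so that word splitting
    # separates the individual host names.
    printf '%s\n' $1 | sort -u
}

# LoadLeveler does not set LOADL_PROCESSOR_LIST when the list would
# exceed 128 host names, so check for it before relying on it.
if [ -z "${LOADL_PROCESSOR_LIST:-}" ]; then
    echo "LOADL_PROCESSOR_LIST is not set; cannot build the node list" >&2
else
    for node in $(unique_hosts "$LOADL_PROCESSOR_LIST")
    do
        # Placeholder for the real action, for example:
        #   rcp localfile $node:/home/userid
        echo "would copy localfile to $node"
    done
fi
```

Piping the list straight into sort -u avoids creating and cleaning up /tmp/node_list, and the explicit check makes the job fail visibly rather than silently skipping the copy when the variable is missing.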