Workload Management API
(rc, errObj) = ll_cluster( cluster_list, CLUSTER_SET | CLUSTER_UNSET )
(rc, errObj) = ll_cluster_auth()
rc = ll_control( control_op, host_list, user_list, job_list, class_list, priority )
rc = llctl( LL_CONTROL_RECYCLE | LL_CONTROL_RECONFIG |
LL_CONTROL_START | LL_CONTROL_STOP |
LL_CONTROL_DRAIN | LL_CONTROL_DRAIN_STARTD |
LL_CONTROL_DRAIN_SCHEDD | LL_CONTROL_PURGE_SCHEDD |
LL_CONTROL_FLUSH | LL_CONTROL_SUSPEND | LL_CONTROL_RESUME |
LL_CONTROL_RESUME_STARTD | LL_CONTROL_RESUME_SCHEDD |
LL_CONTROL_FAVOR_JOB | LL_CONTROL_UNFAVOR_JOB |
LL_CONTROL_FAVOR_USER | LL_CONTROL_UNFAVOR_USER |
LL_CONTROL_HOLD_USER | LL_CONTROL_HOLD_SYSTEM |
LL_CONTROL_HOLD_RELEASE | LL_CONTROL_PRIO_ABS |
LL_CONTROL_PRIO_ADJ | LL_CONTROL_START_DRAINED |
LL_CONTROL_DUMP_LOGS,
host_list, class_list )
rc = llfavorjob( LL_CONTROL_FAVOR_JOB | LL_CONTROL_UNFAVOR_JOB, job_list )
rc = llfavoruser( LL_CONTROL_FAVOR_USER | LL_CONTROL_UNFAVOR_USER, user_list )
rc = llhold( LL_CONTROL_HOLD_USER | LL_CONTROL_HOLD_SYSTEM |
LL_CONTROL_HOLD_RELEASE, host_list, user_list, job_list )
(rc, errObj) = ll_modify( EXECUTION_FACTOR | CONSUMABLE_CPUS |
CONSUMABLE_MEMORY | WCLIMIT_ADD_MIN |
JOB_CLASS | ACCOUNT_NO | STEP_PREEMPTABLE |
SYSPRIO | BG_SIZE | BG_SHAPE | BG_CONNECTION |
BG_PARTITION | BG_ROTATE | BG_REQUIREMENTS |
RESOURCES | NODE_RESOURCES, value, job_step )
(rc, errObj) = ll_move_job( job_id, cluster_name )
rc = llprio( LL_CONTROL_PRIO_ABS | LL_CONTROL_PRIO_ADJ, job_list, priority )
(rc, errObj) = ll_preempt( job_step_id, PREEMPT_STEP | RESUME_STEP | SYSTEM_PREEMPT_STEP )
(rc, errObj) = ll_preempt_jobs( user_list, host_list, job_list, PREEMPT_STEP | RESUME_STEP,
LL_PREEMPT_SUSPEND | LL_PREEMPT_VACATE | LL_PREEMPT_REMOVE |
LL_PREEMPT_SYS_HOLD | LL_PREEMPT_USER_HOLD )
(rc, errObj) = ll_run_scheduler()
rc = ll_start_job_ext( cluster, proc, from_host, node_list )
rc = ll_terminate_job( cluster, proc, from_host, msg )
The LoadLeveler Workload Management API via PyLoadL has the following functions:
ll_cluster
Function to direct subsequent function calls to a selected cluster, or to unselect a previously selected cluster.
(rc, errObj) = ll_cluster( cluster_list, cluster_op )
Parameters
- cluster_list
List of clusters, currently restricted to a single cluster.
- cluster_op
- CLUSTER_SET - select cluster
- CLUSTER_UNSET - unselect cluster
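Because CLUSTER_SET affects the calls that follow it until the cluster is unselected, it can help to pair the two operations. The following is a minimal sketch and not part of PyLoadL: the ll_cluster function is passed in as an argument, the two constants are string placeholders for the real PyLoadL values, and both the rc == 0 success convention and passing the same list on CLUSTER_UNSET are assumptions.

```python
from contextlib import contextmanager

# Placeholders: the real CLUSTER_SET / CLUSTER_UNSET constants come from PyLoadL.
CLUSTER_SET = "CLUSTER_SET"
CLUSTER_UNSET = "CLUSTER_UNSET"


@contextmanager
def selected_cluster(ll_cluster, cluster_name):
    """Select cluster_name for the enclosed calls, then unselect it again."""
    rc, err_obj = ll_cluster([cluster_name], CLUSTER_SET)
    if rc != 0:  # assumption: rc == 0 means success
        raise RuntimeError("ll_cluster(CLUSTER_SET) failed with rc=%s" % rc)
    try:
        yield
    finally:
        # Always unselect, even if the enclosed calls raise an exception.
        ll_cluster([cluster_name], CLUSTER_UNSET)


# Demonstration with a stand-in for PyLoadL's ll_cluster that records its calls.
recorded = []


def fake_ll_cluster(cluster_list, cluster_op):
    recorded.append((list(cluster_list), cluster_op))
    return 0, None


with selected_cluster(fake_ll_cluster, "cluster_a"):
    recorded.append("calls made against cluster_a go here")
```

With the real binding, fake_ll_cluster would be replaced by PyLoadL's ll_cluster and the placeholder constants by the ones PyLoadL exports.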
ll_cluster_auth
Function to generate SSL keys, necessary for secure multicluster communications.
(rc, errObj) = ll_cluster_auth()
ll_control
Function to perform control operations against hosts, jobs, users or job classes.
rc = ll_control( control_op, host_list, user_list, job_list, class_list, priority )
Parameters
- control_op
The control operation to perform, one of the LL_CONTROL_* constants.
- host_list
List of host machines to perform control operation on.
- user_list
List of users to perform control operation on.
- job_list
List of job step IDs to perform control operation on.
- class_list
List of job classes to perform control operation on.
- priority
Value to be assigned for control operation.
- llfavoruser
Function to favour and unfavour given users. This is just a wrapper around ll_control.
rc = llfavoruser( LL_CONTROL_FAVOR_USER | LL_CONTROL_UNFAVOR_USER, user_list )
-
Parameters
- Operation
- LL_CONTROL_FAVOR_USER : Favour the users in user_list.
- LL_CONTROL_UNFAVOR_USER : Unfavour the users in user_list.
- user_list
List of users to perform favour operation on.
- llhold
Function to hold and release given job steps or users. This is just a wrapper around ll_control.
rc = llhold( LL_CONTROL_HOLD_USER | LL_CONTROL_HOLD_SYSTEM | LL_CONTROL_HOLD_RELEASE, host_list, user_list, job_list )
-
Parameters
- Hold Operation
- LL_CONTROL_HOLD_USER
Place on user hold.
- LL_CONTROL_HOLD_SYSTEM
Place on system hold. You need to be a LoadLeveler administrator to perform this operation.
- LL_CONTROL_HOLD_RELEASE
Release from hold. You need to be a LoadLeveler administrator to perform this against system-held jobs.
- host_list
List of host machines.
- user_list
List of users to perform hold operation on.
- job_list
List of job step IDs to perform hold operation on.
- llprio
Function to adjust the priorities of job steps. This is just a wrapper around ll_control.
rc = llprio( LL_CONTROL_PRIO_ABS | LL_CONTROL_PRIO_ADJ, job_list, priority )
-
Parameters
- Priority Operation
- LL_CONTROL_PRIO_ABS : New absolute priority value.
- LL_CONTROL_PRIO_ADJ : New adjusted priority value.
- job_list
List of job step IDs.
- priority
Priority value to assign to the list of job step IDs.
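The wrapper relationship between llfavoruser, llhold, llprio and ll_control can be sketched as follows. This is an illustration only, not the PyLoadL source: ll_control is injected as an argument, make_wrappers is a hypothetical helper, and filling the unused arguments with empty lists and a priority of 0 is an assumption.

```python
def make_wrappers(ll_control):
    """Build llfavoruser, llhold and llprio on top of an injected ll_control."""

    def llfavoruser(favor_op, user_list):
        # Only the user list is meaningful for favour operations.
        return ll_control(favor_op, [], user_list, [], [], 0)

    def llhold(hold_op, host_list, user_list, job_list):
        # Hold operations take hosts, users and job steps, but no classes.
        return ll_control(hold_op, host_list, user_list, job_list, [], 0)

    def llprio(prio_op, job_list, priority):
        # Priority operations act on job steps and carry the priority value.
        return ll_control(prio_op, [], [], job_list, [], priority)

    return llfavoruser, llhold, llprio


# Demonstration with a stand-in for ll_control that records its arguments.
seen = []


def fake_ll_control(control_op, host_list, user_list, job_list, class_list, priority):
    seen.append((control_op, host_list, user_list, job_list, class_list, priority))
    return 0


llfavoruser, llhold, llprio = make_wrappers(fake_ll_control)
# String placeholders stand in for the LL_CONTROL_* constants exported by PyLoadL.
llprio("LL_CONTROL_PRIO_ABS", ["shivling.5.0"], 50)
```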
- ll_preempt
Function to preempt a running job step or to resume a job_step that has already been preempted through the LoadLeveler llpreempt command or via ll_preempt. ll_preempt cannot resume a job step preempted through PREEMPT_CLASS (system-initiated).
(rc, errObj) = ll_preempt( job_step, preempt_op )
-
Parameters
- job_step - The Job Step ID.
- preempt_op - Preemption operation, which can be the following -
- PREEMPT_STEP - Preempts the job step ID.
- RESUME_STEP - Resumes the job step ID.
- ll_preempt_jobs
Function to preempt a set of running job steps using the specified preempt method, or to resume job steps that have already been preempted with the preempt method of suspend through the llpreempt command or the ll_preempt_jobs routine. The ll_preempt_jobs routine cannot resume a job step that was preempted through the PREEMPT_CLASS rules, or a job step that was preempted with a preempt method other than suspend.
(rc, errObj) = ll_preempt_jobs( user_list, host_list, job_list, preempt_op, preempt_method )
-
Parameters
- user_list
List of users to be targeted.
- host_list
List of hosts to be targeted.
- job_list
List of job step IDs in the form host.job_id.step_id, e.g. shivling.5.0
- preempt_op - Preemption operation to perform
- PREEMPT_STEP
Preempts the job step.
- RESUME_STEP
Resumes the job step.
- preempt_method - Preemption method to perform
- LL_PREEMPT_SUSPEND
Suspends the job step.
- LL_PREEMPT_VACATE
Vacates the job step.
- LL_PREEMPT_REMOVE
Removes the job step.
- LL_PREEMPT_SYS_HOLD
Places the job step on system hold.
- LL_PREEMPT_USER_HOLD
Places the job step on user hold.
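Job step IDs of the form host.job_id.step_id can be split into their parts before being placed in job_list. A minimal sketch, assuming the host part may itself contain dots (a fully qualified host name) and that the last two fields are integers; split_step_id is a hypothetical helper, not a PyLoadL function.

```python
def split_step_id(step_id):
    """Split 'host.job_id.step_id' into (host, job_id, step_no)."""
    # Split from the right so that dots inside the host name are preserved.
    host, job_id, step_no = step_id.rsplit(".", 2)
    return host, int(job_id), int(step_no)


host, job_id, step_no = split_step_id("shivling.5.0")
```

A malformed ID (too few fields, or non-numeric job and step fields) raises ValueError, which lets a caller reject it before calling ll_preempt_jobs.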
- ll_modify
Function to modify the attributes of a submitted job step. This interface supports only one job step ID. The underlying API also accepts only one job step at present, but it is designed for expansion, so this interface may change in the future.
(rc, errObj) = ll_modify( modify_op, value, job_step )
-
Parameters
- modify_op - The modify operation to perform.
- EXECUTION_FACTOR : New execution factor, modify_data input is a numeric.
- CONSUMABLE_CPUS : New consumable cpus value, modify_data input is a numeric.
- CONSUMABLE_MEMORY : New consumable memory in megabytes, modify_data input is a numeric.
- WCLIMIT_ADD_MIN : Additional minutes to add to hard wallclock limit, modify_data input is a numeric.
- JOB_CLASS : New job class, modify_data input is a string.
- ACCOUNT_NO : Changes the account number to the specified value for an idle-like job step.
- STEP_PREEMPTABLE : Specifies whether a job is preemptable or nonpreemptable.
- SYSPRIO : Changes the q_sysprio for a job step to the specified integer value. The new job step priority will be fixed. This is a LoadLeveler administrator only option.
- BG_SIZE : Changes the size of an idle-like Blue Gene job. The subsequent value argument must be an integer in units of compute nodes. If this value is specified, any value previously specified for bg_shape or bg_partition will be ignored.
- BG_SHAPE : Changes the shape of an idle-like Blue Gene job. The subsequent value argument must be of the form "XxYxZ", where X, Y, and Z are integers in units of the number of base partitions. If this value is specified, any value previously specified for bg_size or bg_partition will be ignored.
- BG_CONNECTION : Changes the connection option of an idle-like Blue Gene job. The subsequent value argument must be a string that is either TORUS, MESH, or PREFER_TORUS.
- BG_PARTITION : Changes the requested partition ID of an idle-like Blue Gene job. If this value is specified, any value specified for bg_connection, bg_shape, bg_size, or bg_rotate will be ignored.
- BG_ROTATE : Changes the rotate option of an idle-like Blue Gene job. The subsequent value argument must be a string that is either True or False.
- BG_REQUIREMENTS : Changes the memory requirement that a Blue Gene base partition in the LoadLeveler cluster must meet to run an idle-like Blue Gene job. The subsequent value option must be an expression. Memory is the only variable that is supported. bg_requirements cannot be modified if bg_partition is already specified.
- RESOURCES : Replaces the task resource requirements specified in the job command file at submit time. The entire resource requirement must be specified. The rules for the syntax of the resources_string are the same as the rules for the corresponding job command file keywords. Only a job step in an idle-like state can be changed. Any resource requirement that was originally specified and is omitted from this string will be removed from the job step. The command will fail if you specify the same resources in both the resources and node_resources statements.
- NODE_RESOURCES : Replaces the node resource requirements specified in the job command file at submit time. The entire resource requirement must be specified. The rules for the syntax of the resources_string are the same as the rules for the corresponding job command file keywords. Only a job step in an idle-like state can be changed. Any resource requirement that was originally specified and is omitted from this string will be removed from the job step. The command will fail if you specify the same resources in both the resources and node_resources statements.
- value
The new data value for modify_op (referred to as modify_data in the operation descriptions above).
- job_step
String representing the job step ID.
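Some of the value formats described above can be checked before ll_modify is called. The following is a minimal sketch under stated assumptions: check_modify_value is a hypothetical helper, the operations are named by string placeholders rather than the PyLoadL constants, and requiring BG_SIZE to be positive is an assumption. Operations without a rule here are accepted unchecked.

```python
import re

BG_CONNECTIONS = ("TORUS", "MESH", "PREFER_TORUS")
SHAPE_PATTERN = re.compile(r"^[0-9]+x[0-9]+x[0-9]+$")  # the "XxYxZ" form


def check_modify_value(modify_op, value):
    """Return True if value has the documented form for modify_op."""
    if modify_op == "BG_SHAPE":
        return isinstance(value, str) and bool(SHAPE_PATTERN.match(value))
    if modify_op == "BG_CONNECTION":
        return isinstance(value, str) and value in BG_CONNECTIONS
    if modify_op == "BG_ROTATE":
        # The documentation asks for the strings True or False, not booleans.
        return isinstance(value, str) and value in ("True", "False")
    if modify_op == "BG_SIZE":
        # Integer number of compute nodes; bool is excluded explicitly.
        return isinstance(value, int) and not isinstance(value, bool) and value > 0
    return True  # no rule sketched for the remaining operations


shape_ok = check_modify_value("BG_SHAPE", "2x2x4")
```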
- ll_run_scheduler
This is used when the internal scheduling interval has been disabled so that an external program can control when the central manager attempts to schedule job steps. The ll_run_scheduler subroutine sends a request to the central manager to run the scheduling algorithm.
(rc, errObj) = ll_run_scheduler()
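An external program that drives scheduling might request a scheduling pass at a fixed interval. A minimal sketch, assuming rc == 0 indicates success; trigger_scheduler is a hypothetical helper, and ll_run_scheduler and the sleep function are passed in so the loop can be exercised without a LoadLeveler cluster.

```python
import time


def trigger_scheduler(ll_run_scheduler, rounds, interval_seconds, sleep=time.sleep):
    """Request up to `rounds` scheduling passes, stopping at the first failure."""
    return_codes = []
    for round_no in range(rounds):
        rc, err_obj = ll_run_scheduler()
        return_codes.append(rc)
        if rc != 0:  # assumption: a non-zero rc means the request failed
            break
        if round_no < rounds - 1:
            sleep(interval_seconds)  # wait before the next pass
    return return_codes


# Demonstration with a stand-in that always succeeds and a no-op sleep.
codes = trigger_scheduler(lambda: (0, None), 3, 30, sleep=lambda seconds: None)
```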
- ll_start_job_ext
Function to instruct the LoadLeveler negotiator to start a job on the specified nodes and adapters. This is meant for use by people writing external schedulers.
rc = ll_start_job_ext( step_id, node_list, adapter_list )
-
Parameters
- step_id
String representing the job step ID.
- node_list
List of node names where the job will be started. The first member of the list is the parallel master node.
- adapter_list
List of lists containing adapter information for each node. The members of the list are :
- dev_name
Device name of the adapter to be used, such as css0.
- protocol
Communication protocol this usage supports. Valid values are MPI, LAPI, and MPI_LAPI.
- subsystem
Communication subsystem this usage supports. Valid values are IP or US.
- wid
For US subsystem usages, this indicates which adapter window ID to use. For IP subsystem usages, this field is ignored.
- mem
For US subsystem usages, this is the amount of adapter memory to dedicate to the adapter usage. For IP subsystem usages, this field is ignored.
Each element in the adapter_list represents one communication channel for a task. If the subsystem is US (User Space), a communication channel will require a switch adapter window. Adapter windows, and User Space usages, must be specified on actual switch adapters that are only accessible if AGGREGATE_ADAPTERS=False is specified in the configuration file.
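The adapter_list structure can be assembled with a small helper that validates the documented protocol and subsystem values. This is a sketch only: adapter_usage is a hypothetical helper, the member order follows the order in which the fields are listed above (an assumption), en0 is an illustrative device name, and the zero defaults for wid and mem are placeholders for the IP case where those fields are ignored.

```python
VALID_PROTOCOLS = ("MPI", "LAPI", "MPI_LAPI")
VALID_SUBSYSTEMS = ("IP", "US")


def adapter_usage(dev_name, protocol, subsystem, wid=0, mem=0):
    """Build one adapter_list element: [dev_name, protocol, subsystem, wid, mem]."""
    if protocol not in VALID_PROTOCOLS:
        raise ValueError("unknown protocol: %r" % (protocol,))
    if subsystem not in VALID_SUBSYSTEMS:
        raise ValueError("unknown subsystem: %r" % (subsystem,))
    return [dev_name, protocol, subsystem, wid, mem]


# One User Space channel on a switch adapter and one IP channel.
adapter_list = [
    adapter_usage("css0", "MPI", "US", wid=1, mem=1048576),
    adapter_usage("en0", "MPI", "IP"),
]
```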
- ll_terminate_job
Function to instruct the LoadLeveler negotiator to cancel the specified job_step.
rc = ll_terminate_job( cluster, proc, from_host, msg )
-
Parameters
- cluster
String representing the job step ID.
- proc
String representing the job step to be cancelled.
- from_host
String representing the name of the schedd host.
- msg
Message string, retrievable via ll_get_data, explaining why the job was cancelled.