Comparison Tables
This document lists the basic resource specifications and batch options for the LoadLeveler (Champion), LSF (Lonestar), and SGE (Ranger and Stampede) batch utilities and compares them side by side. Although no TACC system currently uses the PBS batch system, PBS commands are included to help users migrate from sites that use PBS to one of the batch systems used at TACC.
Table 1 compares the resource syntax of each batch system for the most commonly employed user specifications. Examples of these specifications include the total number of nodes and tasks per node, the wall clock time, and the peak memory usage per task. Additional specifications include the delineation of a specific "class" or "queue" to designate a relative priority in advancing through the queue structure and an email request for notifying the user at the beginning or end of a given job execution.
Table 2 lists important environment variables under each batch system, and Table 3 compares the resource management commands used to submit, monitor, and cancel queued jobs. As a final comparison, example submission scripts are provided for each batch system; they request comparable resources and run a parallel executable named mpihello.
Table 1. Important Resource Syntax for LoadLeveler, PBS, LSF, and SGE
| Utility | LoadLeveler (LL) | PBS | LSF | SGE |
| Resource Sentinel | # @ | #PBS | #BSUB | #$ |
| Nodes/Processors | node = <#> and tasks_per_node = <#> | -l nodes=<#>:ppn=<#> (ppn = processors per node) | -n <#> | -pe <#>way <#cc> (<#> = wayness = cores/node; <#cc> = core count) |
| Wall Clock Limit | wall_clock_limit = [dd:]hh:mm:ss | -l walltime=hh:mm:ss | -W hh:mm | -l h_rt=hh:mm:ss |
| Queue | class = <queue> | -q <queue> | -q <queue> | -q <queue> |
| Email Notification | notification = always, error, start, never, or complete | -me | -B (sends mail when job begins execution) -N (sends job report by mail when job finishes) | -m be (sends mail when job begins/ends execution) |
| Email Address | notify_user = <email> | -M <email_address> | -u <email_address> | -M <email_address> |
| Initial Directory | initialdir = <directory> | (default = $HOME) | (default = job submission directory) | (default = $HOME) |
| Job Name | job_name = <name> | -N <name> | -J <name> | -N <name> |
| STDERR & STDOUT to same file | output = <file> and error = $(output) | -j oe | (use -o without -e) | -j y |
| Project to charge | account_no = <project> | -P <project> | -A <project> | |
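The wall clock limit is one of the few values in Table 1 whose format differs between systems: PBS and SGE take hh:mm:ss, while LSF takes hh:mm. The following is a minimal sketch of that conversion in POSIX shell. The function name `to_lsf_walltime` is hypothetical, the optional LoadLeveler `dd:` prefix is not handled, and leftover seconds are rounded up so that the converted limit is never shorter than the original.

```shell
# Hypothetical helper: convert an hh:mm:ss limit (PBS, SGE) to hh:mm (LSF).
to_lsf_walltime() {
    # Split the argument on ":" using parameter expansion.
    hh=${1%%:*}
    rest=${1#*:}
    mm=${rest%%:*}
    ss=${rest#*:}
    # Drop one leading zero so values such as "08" are not read as octal.
    hh=${hh#0}; mm=${mm#0}; ss=${ss#0}
    hh=${hh:-0}; mm=${mm:-0}; ss=${ss:-0}
    # Round any leftover seconds up to a full minute.
    if [ "$ss" -gt 0 ]; then mm=$((mm + 1)); fi
    if [ "$mm" -ge 60 ]; then hh=$((hh + 1)); mm=$((mm - 60)); fi
    printf '%d:%02d\n' "$hh" "$mm"
}

to_lsf_walltime 06:00:00   # the 6-hour limit used in the PBS example
to_lsf_walltime 01:30:00   # the 1.5-hour limit used in the SGE example
```

The result can be pasted after `#BSUB -W` when porting a PBS or SGE script to LSF.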
Table 2. Important Environment Variables
| Variable | LoadLeveler | PBS | LSF | SGE |
| Processor List | $LOADL_PROCESSOR_LIST | cat -n $PBS_NODEFILE | $LSB_HOSTS | (not available) |
| Submission Directory | $LOADL_STEP_INITDIR | $PBS_O_WORKDIR | $LS_SUBCWD | #$ -V or $SGE_O_WORKDIR |
| Job ID | $LOADL_STEP_ID | $PBS_JOBID | $LSB_JOBID | $JOB_ID |
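A script that has to run under more than one batch system can test which of the submission directory variables in Table 2 is set. The sketch below is one way to do that in POSIX shell; the function name `submission_dir` and the fallback to the current directory outside a batch job are assumptions made for illustration.

```shell
# Hypothetical helper: print the submission directory using whichever
# variable from Table 2 is set in the current environment.
submission_dir() {
    if [ -n "${PBS_O_WORKDIR:-}" ]; then
        echo "$PBS_O_WORKDIR"          # PBS
    elif [ -n "${LS_SUBCWD:-}" ]; then
        echo "$LS_SUBCWD"              # LSF
    elif [ -n "${SGE_O_WORKDIR:-}" ]; then
        echo "$SGE_O_WORKDIR"          # SGE
    elif [ -n "${LOADL_STEP_INITDIR:-}" ]; then
        echo "$LOADL_STEP_INITDIR"     # LoadLeveler
    else
        pwd                            # assumption: not inside a batch job
    fi
}

echo "Submission directory: $(submission_dir)"
```

Inside a job script, `cd "$(submission_dir)"` would then replace the system-specific `cd` lines used in the example scripts.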
Table 3. Queue Management Commands for Each System
| Purpose | LoadLeveler | PBS | LSF | SGE |
| Submission | llsubmit job | qsub job | bsub < job | qsub job |
| Deletion | llcancel | qdel | bkill | qdel |
| Status | llq | qstat | bjobs | qstat |
| List Queue | llclass | qstat -Q | bqueues -l | qconf -sql |
| GUI Monitor | xloadl | xpbsmon | (not available) | (not available) |
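For users who move between systems, the mapping in Table 3 can be captured in a small lookup function. The following is only a sketch: the function name `batch_cmd` and the system and action labels are hypothetical, and the function prints the command name rather than running it, so it works on a machine with no scheduler installed.

```shell
# Hypothetical helper: print the Table 3 command for a batch system and action.
batch_cmd() {
    case "$1:$2" in
        loadleveler:submit)      echo "llsubmit" ;;
        loadleveler:cancel)      echo "llcancel" ;;
        loadleveler:status)      echo "llq" ;;
        pbs:submit|sge:submit)   echo "qsub" ;;
        pbs:cancel|sge:cancel)   echo "qdel" ;;
        pbs:status|sge:status)   echo "qstat" ;;
        lsf:submit)              echo "bsub" ;;   # script is read from stdin: bsub < job
        lsf:cancel)              echo "bkill" ;;
        lsf:status)              echo "bjobs" ;;
        *) echo "unknown system or action: $1 $2" >&2; return 1 ;;
    esac
}

batch_cmd sge status
batch_cmd lsf cancel
```

Note that LSF is the one system in the table where the job script is redirected into the submission command instead of being passed as an argument.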
Example Batch Scripts
Example job scripts for each batch system follow. The scripts request comparable resources and run the same parallel executable, mpihello.
Example PBS job script
In the example below, the environment variables PBS_O_HOST, PBS_NODEFILE, and PBS_O_WORKDIR contain the master host, the list of assigned compute nodes, and the submission directory, respectively. The mpirun command launches the parallel application on 16 processors (the "-np" argument).
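| #!/bin/csh | |
| #PBS -l nodes=8:ppn=2 | # 8 nodes, 2 processors per node |
| #PBS -l walltime=6:00:00 | # Wall clock limit (hh:mm:ss) |
| #PBS -q normal | # Queue name "normal" |
| #PBS -N hello | # Job name |
| #PBS -j oe | # Combine stderr and stdout |
| #PBS -me -M <email_address> | # Email at end of job |
| echo "Master Host: $PBS_O_HOST" | |
| echo "Nodes:"; cat -n $PBS_NODEFILE; echo "" | # List the assigned compute nodes |
| echo "-----------------------------------------------" | |
| cd $PBS_O_WORKDIR | # Change to the submission directory |
| mpirun -np 16 ./mpihello | # Launch on 16 processors |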
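The PBS example hard-codes the processor count passed to mpirun. A possible alternative is to derive it from the node file, as in the sketch below. It assumes that the file named by $PBS_NODEFILE holds one line per assigned processor (so a node appears once per ppn slot), which should be checked on the target system. The function names and the sample file are hypothetical; the sample file stands in for $PBS_NODEFILE so the snippet runs outside a PBS job.

```shell
# Hypothetical helpers: count lines (processor slots) and distinct hosts
# in a node file such as the one named by $PBS_NODEFILE.
count_slots() { wc -l < "$1" | tr -d ' '; }
count_nodes() { sort -u "$1" | wc -l | tr -d ' '; }

# Demo only: build a small sample file, since $PBS_NODEFILE exists only
# inside a running PBS job.
nodefile=$(mktemp)
printf 'node1\nnode1\nnode2\nnode2\n' > "$nodefile"

np=$(count_slots "$nodefile")
echo "mpirun -np $np ./mpihello   # on $(count_nodes "$nodefile") distinct nodes"
rm -f "$nodefile"
```

In a real job script, "$PBS_NODEFILE" would be passed in place of the sample file.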
Example LSF job script
In the LSF example below, LSF replaces the "%J" expression with the job ID. The environment variables LSB_HOSTS and LS_SUBCWD contain the list of assigned compute nodes and the submission directory, respectively.
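| #!/bin/csh | |
| #BSUB -n 16 | # 16 processors |
| #BSUB -W 6:00 | # Wall clock limit (hh:mm) |
| #BSUB -q normal | # Queue name "normal" |
| #BSUB -J hello | # Job name |
| #BSUB -o out.o%J | # Name of the output file |
| #BSUB -u <email_address> | # Email notification address |
| echo "Master Host: `hostname`" | |
| echo "Node List: $LSB_HOSTS" | # List the assigned compute nodes |
| cd $LS_SUBCWD | # Change to the submission directory |
| ibrun ./mpihello | # Run the MPI executable |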
Example LoadLeveler job script
In the following LoadLeveler example, the "environment" keyword provides a semicolon-separated list of environment variable settings (variable_name=variable_value).
Note: The environment resource specification must be on a single line (the expression used below is wrapped only for display). Setting COPY_ALL (without a value) signals LoadLeveler to copy all of your interactive environment variables to the batch environment. The MP_EUILIB=us and MP_SHARED_MEMORY settings select the user space ("us") communication software and shared memory buffers for MPI, respectively. The network.MPI values (csss, shared, US) specify the SP2 dual-plane adapters, shared memory, and the "us" software stack, respectively.
The $LOADL_PROCESSOR_LIST and $LOADL_STEP_INITDIR environment variables contain the list of processors and the submission directory, respectively. The "poe" command launches the parallel application on 16 processors (node and tasks_per_node together determine the number of processors). For code compiled with the "MP" compilers (mpxlf90, mpcc, etc.), the "poe" command is not necessary. Using hpmcount in place of poe provides hardware counter information for the parallel execution.
| #!/usr/bin/csh | |
| # | |
| # @ environment = COPY_ALL;MP_EUILIB=us;MP_INTRDELAY=100; | # Wrapped for display; must be a single line |
| XLSMPOPTS=parthds=1;SPINLOOPTIME=10000; | |
| YIELDLOOPTIME=10000;MP_CPU_USE=multiple; | |
| MP_SHARED_MEMORY=yes | |
| # @ node = 4 | # 4 nodes |
| # @ tasks_per_node = 4 | # 4 tasks per node, 16 tasks total |
| # @ resources = ConsumableCpus(1) ConsumableMemory(1800MB) | |
| # @ wall_clock_limit = 06:00:00 | # Wall clock limit (hh:mm:ss) |
| # @ class = normal | # Queue name "normal" |
| # @ job_name = hello | # Job name |
| # @ output = $(job_name).o$(jobid) | |
| # @ error = $(job_name).o$(jobid) | # Same file as output |
| # @ notification = never | |
| # @ network.MPI = csss,shared,US | |
| # @ job_type = parallel | |
| # @ notify_user = <email_address> | |
| # @ queue | |
| echo "Master Host: `hostname`" | |
| echo "NODELIST: $LOADL_PROCESSOR_LIST" | |
| echo "----------------------------------" | |
| cd $LOADL_STEP_INITDIR | # Change to the submission directory |
| poe ./mpihello | # Launch the parallel executable |
Example SGE job script
| #!/bin/bash | |
| #$ -V | # Inherit the submission environment |
| #$ -cwd | # Start job in submission directory |
| #$ -N myMPI | # Job Name |
| #$ -j y | # Combine stderr and stdout |
| #$ -o ${JOB_NAME}.o${JOB_ID} | # Name of the output file (e.g. myMPI.oJobID) |
| #$ -pe 16way 32 | # Requests 16 tasks/node, 32 cores total |
| #$ -q normal | # Queue name "normal" |
| #$ -l h_rt=01:30:00 | # Run time (hh:mm:ss) - 1.5 hours |
| #$ -M <email_address> | # Use email notification address |
| #$ -m be | # Email at Begin and End of job |
| set -x | # Echo commands, use "set echo" with csh |
| ibrun ./mpihello | # Run the MPI executable named "mpihello" |
The "$JOB_NAME" variable is allowed in the batch resource specification in SGE. SGE launches 16 executables on each node (16way) and uses the core count (32) to determine the number of nodes; hence the core count must be divisible by 16 at TACC. If the number of tasks to be launched is not divisible by 16, set the MY_NSLOTS environment variable to the number of tasks and set the core count to the next number divisible by 16 (in order to get the correct number of nodes).
Note: "ibrun" is a wrapper script at TACC for using an InfiniBand aware MPI.
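The core count rule described above is a round-up to the next multiple of 16. The sketch below works through it in POSIX shell for a hypothetical job with 20 tasks; the function name `round_up_cores`, the task count, and the use of `export` to set MY_NSLOTS are assumptions made for illustration.

```shell
# Hypothetical helper: round a task count up to the next multiple of the
# wayness (tasks per node, 16 by default).
round_up_cores() {
    ntasks=$1
    way=${2:-16}
    echo $(( (ntasks + way - 1) / way * way ))
}

ntasks=20                                # example value, not from the source
cores=$(round_up_cores "$ntasks" 16)

# Print the lines that would go into the SGE job script.
echo "#\$ -pe 16way $cores"
echo "export MY_NSLOTS=$ntasks"
echo "nodes requested: $((cores / 16))"
```

For 20 tasks the core count becomes 32, which requests two nodes while MY_NSLOTS limits the launch to 20 tasks.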


