Job Arrays

A job array lets you submit one job that expands into many identical tasks (array elements).

Each element runs the same script but with a different array index, making it well suited to embarrassingly parallel workloads such as file-by-file processing.

Example

#!/bin/bash
#PBS -J 1-10                          # Create a job array with 10 tasks
#PBS -l select=1:ncpus=2:mem=8gb      # Request resources
#PBS -l walltime=01:00:00             # Set a maximum walltime

module load mysoftware   # Load any necessary modules

cd $PBS_O_WORKDIR        # Move to the directory the job was submitted from

# Run the application, using the current PBS_ARRAY_INDEX 
# to select a specific input file. Each array task runs 
# independently and in parallel, processing a different file.
./my_program mydata/$PBS_ARRAY_INDEX/myfile.dat 

This example script creates a job array of 10 parallel tasks. Each task requests 2 CPU cores and 8 GB of memory for up to 1 hour, so the array as a whole can occupy up to 20 CPU cores and 80 GB of memory across the cluster. Each task processes a different input file, selected by its array index.
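When inputs are not laid out one per directory, a common variation is to keep a plain list of file paths and have each task pick the line matching its index. A minimal sketch (the file list and paths here are hypothetical, and PBS_ARRAY_INDEX is simulated; PBS sets it for you inside a real job):

```shell
# Hypothetical file list; in a real workflow you would generate this once
# before submitting, e.g. with: ls mydata/*.dat > files.txt
printf 'mydata/a.dat\nmydata/b.dat\nmydata/c.dat\n' > files.txt

# Simulated here for illustration; PBS exports this inside each task
PBS_ARRAY_INDEX=2

# sed -n "${N}p" prints only line N of the file
input=$(sed -n "${PBS_ARRAY_INDEX}p" files.txt)
echo "$input"    # second line of the list: mydata/b.dat
```

This pattern decouples the array range from the directory layout: only the list file needs to change when the inputs do.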

Environment Variables

When PBS launches each array element, it automatically sets several environment variables that your script can query.

Variable          What it holds
PBS_ARRAY_INDEX   The element’s numeric index.
PBS_ARRAY_ID      Job ID of the parent array.
PBS_JOBID         Job ID of this element; element IDs look like 1234[7].
PBS_NODEFILE      Hostfile listing the node(s) allocated to this element.
PBS_O_WORKDIR     Directory where you ran qsub.
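For example, an element can recover its parent array's numeric ID by stripping the bracketed index from its own job ID. A minimal sketch with simulated values (inside a real job, PBS exports these automatically):

```shell
# Simulated values; PBS sets these inside a running array element
PBS_ARRAY_INDEX=7
PBS_JOBID='1234[7].pbsserver'

# Strip everything from the first '[' onward to get the parent array's ID
parent="${PBS_JOBID%%\[*}"
echo "element $PBS_ARRAY_INDEX of array $parent"    # → element 7 of array 1234
```

This is handy for naming per-element output files that still group under the parent array, e.g. results/${parent}/${PBS_ARRAY_INDEX}.out.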

Array Directive

The -J option is what turns a single job submission into a job array.

Directive      Task indices created
-J 1-10        1 2 3 4 5 6 7 8 9 10
-J 0-99        0 1 2 … 99
-J 1-100:5     1 6 11 … 96 (step = 5)
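With a stepped range the indices are sparse, so scripts often convert the raw index into a dense zero-based run number using shell arithmetic. A sketch assuming -J 1-100:5 (PBS_ARRAY_INDEX is simulated here; PBS sets it per task):

```shell
# Simulated; under -J 1-100:5 PBS assigns indices 1, 6, 11, ..., 96
PBS_ARRAY_INDEX=16

# Map the sparse index to a dense run number: 1 -> 0, 6 -> 1, 11 -> 2, ...
run=$(( (PBS_ARRAY_INDEX - 1) / 5 ))
echo "$run"    # → 3
```

The dense run number is convenient for indexing into parameter lists or output directories that are numbered consecutively.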

Monitoring Array Jobs

# Show all elements (running, queued, finished)
qstat -t

# Detailed info for an element
qstat -f 1234[7]

# Post-mortem trace (after completion)
tracejob 1234[7]