Slurm basics

Main Slurm Commands

  • sbatch - submit a job script.
  • srun - run a command on allocated compute node(s).
  • scancel - cancel a pending or running job.
  • squeue - show state of jobs.
  • sinfo - show state of nodes and partitions (queues).
  • smap - show jobs, partitions and nodes in a graphical network topology.

SBatch

The sbatch command submits a batch script to the Slurm queue manager. The script typically contains one or more srun commands, each of which launches a job step once the job has been allocated resources.
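As a minimal sketch of a submission, assuming a job script named jobscript (the name used in the examples on this page): sbatch prints the ID of the new job, and its --parsable option reduces that output to the bare ID, followed by ;clustername on multi-cluster setups, so a shell script can capture it. The helper name job_id_from_sbatch is made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: reduce the output of 'sbatch --parsable' to the job ID.
# That output is either "12345" or "12345;clustername".
job_id_from_sbatch() {
    printf '%s\n' "${1%%;*}"
}

# Only attempt a submission where Slurm is installed and the script exists.
if command -v sbatch >/dev/null 2>&1 && [ -f jobscript ]; then
    out=$(sbatch --parsable jobscript)
    echo "Submitted job $(job_id_from_sbatch "$out")"
fi
```

The captured ID can then be passed to squeue or scancel to follow up on that job.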

SRun

The srun command submits a job for immediate execution or, inside an existing allocation such as a batch script, launches a job step. For the full range of options that can be passed to srun, see its man page (type man srun at the command prompt).
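A small sketch of direct srun use from a login shell may help here. The wrapper run_tasks is a hypothetical name, and the five-minute limit is an arbitrary choice; -n (number of tasks) and --time are standard srun options. Where Slurm is not installed, the wrapper only prints the command it would have run.

```shell
#!/bin/bash
# Hypothetical wrapper: run a command as N parallel tasks through srun,
# or print the srun command line when Slurm is not installed.
run_tasks() {
    local ntasks=$1
    shift
    if command -v srun >/dev/null 2>&1; then
        srun -n "$ntasks" --time=00:05:00 "$@"
    else
        echo "srun -n $ntasks --time=00:05:00 $*"
    fi
}

# Each of the two tasks prints the name of the node it runs on.
run_tasks 2 hostname
```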

SCancel

The scancel command terminates pending and running jobs and job steps. It can also send a Unix signal to all processes associated with a running job or job step.
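The two uses described above can be sketched as follows. The job ID 12345 is a placeholder, and cancel_cmd is a hypothetical helper that only assembles the command line so that it can be inspected before it is run; --signal is a standard scancel option.

```shell
#!/bin/bash
# Hypothetical helper: print the scancel command for a job ID and an
# optional signal name, without running it.
cancel_cmd() {
    local job_id=$1
    local signal=${2:-}
    if [ -n "$signal" ]; then
        echo "scancel --signal=$signal $job_id"
    else
        echo "scancel $job_id"
    fi
}

cancel_cmd 12345        # terminate the whole job
cancel_cmd 12345 USR1   # send SIGUSR1 to the job's steps instead

# To cancel all of your own pending jobs on a cluster, you could run:
#   scancel --user="$USER" --state=PENDING
```

By default, signals other than SIGKILL are not delivered to the batch shell itself; see the --batch option in man scancel if the script needs to receive them.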

SQueue

The squeue command will report the state of running and pending jobs.
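For scripting, squeue output can be reduced to just the fields you need. The sketch below assumes the format string "%i %T" (job ID and state) with the header suppressed by -h; job_state is a hypothetical helper and the sample job IDs are made up.

```shell
#!/bin/bash
# Hypothetical helper: pick the state of one job out of the output of
# 'squeue -h -o "%i %T"', which has one "<jobid> <STATE>" pair per line.
job_state() {
    awk -v id="$1" '$1 == id { print $2 }'
}

# Sample text in the shape that format string produces (made-up job IDs).
printf '101 RUNNING\n102 PENDING\n' | job_state 102

# On a cluster, feed it live data for your own jobs instead.
if command -v squeue >/dev/null 2>&1; then
    squeue -h -u "$USER" -o "%i %T" | job_state 102
fi
```

A job that has already finished no longer appears in squeue, so an empty result usually means the job has left the queue.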

SInfo

The sinfo command will report the status of the available partitions and nodes.
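As an illustration, sinfo output can be summarized to see where idle nodes are. The sketch assumes the format string "%P %D %t" (partition, node count, compact state); idle_nodes is a hypothetical helper and the partition names in the sample are made up.

```shell
#!/bin/bash
# Hypothetical helper: sum idle nodes per partition from the output of
# 'sinfo -h -o "%P %D %t"' ("<partition> <node count> <state>" per line).
# Only the exact state "idle" is counted; suffixed states are ignored.
idle_nodes() {
    awk '$3 == "idle" { n[$1] += $2 } END { for (p in n) print p, n[p] }' | sort
}

# Sample text in the shape that format string produces (made-up partitions).
printf 'batch* 4 idle\nbatch* 12 alloc\ngpu 3 idle\ngpu 1 down\n' | idle_nodes

# On a cluster, use live data instead.
if command -v sinfo >/dev/null 2>&1; then
    sinfo -h -o "%P %D %t" | idle_nodes
fi
```

The asterisk that sinfo appends to a partition name marks the default partition.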

SMap

The smap command is similar to the sinfo command, except that it displays the information in a pseudo-graphical ncurses interface. Recent Slurm releases no longer include smap, so it may not be available on your cluster.

Example Scripts

Script 1

The following script runs a program as four (4) tasks.

#!/bin/bash
srun -n 4 ./my_program

Script 2

This script runs the same program as Script 1, but requests the four tasks through #SBATCH directives instead of srun arguments. It also places two tasks on each node and sets a 30-minute time limit.

#!/bin/bash
#SBATCH -n 4
#SBATCH --ntasks-per-node=2
#SBATCH --time=00:30:00

srun ./my_program

To submit the script, just run

$> sbatch jobscript

Script 3

Running two job steps at the same time on one node:

#!/bin/bash
#SBATCH -N 1
#SBATCH -n 2
#SBATCH --time=00:30:00

# Use '&' to run the first job step in the background
srun -n 1 ./job1.batch &
srun -n 1 ./job2.batch

# Use 'wait' as a barrier so the script does not exit until both job steps have finished.
wait

To submit the script, just run

$> sbatch jobscript

Script 4

Naming output and error files:

#!/bin/bash
#SBATCH -n 2
#SBATCH --time=00:05:00
#SBATCH --error=job.%J.err
#SBATCH --output=job.%J.out

srun ./my_program

To submit the script, just run

$> sbatch jobscript

Script 5

Running an R script on eight CPUs:

#!/bin/bash
#SBATCH --nodes=1            # request one node
#SBATCH --cpus-per-task=8    # ask for 8 CPUs
#SBATCH --time=02:00:00      # allow the job to run for up to 2 hours
#SBATCH --error=job.%J.err   # store error messages in this file
#SBATCH --output=job.%J.out  # store console output in this file

module load R                # load the site's default version of R

R CMD BATCH Rscript.R        # run the R script; R writes its output to Rscript.Rout

To submit the script, just run

$> sbatch jobscript

Script 6

Running a job that needs a GPU

#!/bin/bash
#SBATCH --nodes=1            # request one node
#SBATCH --cpus-per-task=8    # ask for 8 CPUs
#SBATCH --time=02:00:00      # allow the job to run for up to 2 hours
#SBATCH --error=job.%J.err   # store error messages in this file
#SBATCH --output=job.%J.out  # store console output in this file
#SBATCH --gres=gpu:1         # ask for 1 GPU; the scheduler places the job on a node with a free GPU

module load R                # load the site's default version of R

R CMD BATCH Rscript.R        # run the R script; R writes its output to Rscript.Rout

To submit the script, just run

$> sbatch jobscript
