Research Computing | Condo2017 Scheduler


This video will cover the differences in the scheduler between Condo and Condo2017. Condo2017 runs a newer job scheduler called SLURM. The previous version of Condo uses a scheduler commonly known as PBS (specifically, Condo uses Torque/Maui).

The new SLURM scheduler provides greater flexibility and control over job scheduling, and has been widely adopted on large clusters at other major institutions. Becoming familiar with the SLURM scheduler will be beneficial if you run jobs on other clusters.

Compatibility with PBS:

SLURM includes a compatibility layer for PBS scripts to allow for an easy transition from PBS to SLURM. We have tested a variety of PBS scripts with SLURM, with mixed success. You can attempt to submit your PBS scripts to SLURM unchanged, but if you run into issues, it is best to rewrite your job script in the SLURM format rather than attempt to troubleshoot the PBS issues.

There is also a utility to convert PBS scripts to SLURM, but you may need to make some manual edits after conversion. The conversion utility is called 'pbs2sbatch' and is installed on Condo2017.

Writing SLURM scripts:

A key concept to keep in mind when writing scheduler scripts is that you're asking the scheduler to make you a reservation. The scheduler has no knowledge of what your program is going to actually do, or how long it will run. The submission script is simply asking the scheduler to make you a reservation for the appropriate amount of time and compute nodes based on your best estimate. If you underestimate the amount of time required, your job will be terminated when the requested time runs out, even if it has not finished.

The format of simple SLURM job scripts is straightforward, and should look familiar to you if you have experience with PBS.
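Because a job that outruns its reservation is terminated, one common habit is to pad your runtime estimate before requesting time. The sketch below is only an illustration: `pad_time` is a hypothetical helper (not part of SLURM or Condo2017), and the 25% default margin is an arbitrary assumption rather than a site recommendation. It turns an estimate in minutes into the HH:MM:SS form used with `--time`.

```shell
#!/bin/bash
# Hypothetical helper (not a SLURM command): pad a runtime estimate given in
# minutes by a safety margin and print it as HH:MM:SS for use with --time.
pad_time() {
    local est_minutes=$1
    local margin_pct=${2:-25}   # assumed default margin; adjust to taste
    # Integer arithmetic, rounded up to the next whole minute.
    local total=$(( (est_minutes * (100 + margin_pct) + 99) / 100 ))
    printf '%02d:%02d:00\n' $(( total / 60 )) $(( total % 60 ))
}

pad_time 24      # 24-minute estimate with the default 25% margin
pad_time 90 50   # 90-minute estimate with a 50% margin
```

The printed value can be pasted into the `#SBATCH --time=` line of a job script. Requesting far more time than needed may make a job harder to schedule, so a modest margin is usually the better trade-off.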
The key elements you need in your SLURM script for Condo2017 are:

- Time (how long you are requesting for your job to be able to run)
- Nodes (how many servers your job should be spread across). Your code needs to be capable of node-to-node communication (likely via MPI) for more than one node to be beneficial to you.
- Modules to load
- The command to run your program

A simple submission script (myjob.sh) will look something like:

    #!/bin/bash
    #SBATCH --nodes=1
    #SBATCH --time=00:30:00
    #SBATCH --error=job.%J.err    # store the error messages in a file
    #SBATCH --output=job.%J.out   # store the console output text in a file
    module purge
    module load R
    srun Rscript myscript.R

To submit your job, simply run:

    $ sbatch myjob.sh

SLURM Script Writer:

The HPC team has created a web-based script writer to help create basic scripts for users who are new to SLURM. You can input your parameters, and the script writer will output your job script for you.

http://www.hpc.iastate.edu/guides/condo2017/slurm-job-script-writer

Closi... has been a review of the scheduler on the Condo2017 cluster. For more information on SLURM, please review the following links:

https://researchit.las.iastate.edu/slurm-basics
http://gif.biotech.iastate.edu/slurm-slurm-job-management-cheat-sheet
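As a supplement to the Script Writer section, the same idea can be sketched on the command line. This is not the HPC team's tool: `make_job_script` is a hypothetical function that takes a node count, a time limit, and the command to run, and prints a script shaped like the myjob.sh example. The `R` module is hard-coded purely because the example uses it.

```shell
#!/bin/bash
# Hypothetical helper (not the HPC team's script writer): print a basic SLURM
# job script built from a node count, a time limit, and the command to run.
make_job_script() {
    local nodes=$1 time_limit=$2
    shift 2   # the remaining arguments are the command to run
    cat <<EOF
#!/bin/bash
#SBATCH --nodes=${nodes}
#SBATCH --time=${time_limit}
#SBATCH --output=job.%J.out
#SBATCH --error=job.%J.err
module purge
module load R
srun $*
EOF
}

make_job_script 1 00:30:00 Rscript myscript.R > myjob.sh

# Submit only where SLURM is installed; elsewhere just display the script.
if command -v sbatch >/dev/null 2>&1; then
    sbatch myjob.sh
else
    cat myjob.sh
fi
```

Generating the script from parameters keeps the time and node requests in one place, which can help when you submit the same program with different reservations.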
