Using local scratch space for I/O intensive jobs

On some of the Pronto nodes, a temporary directory called /scratch is available to you.

Please note that the /scratch directory is only available on the nodes listed below, so you will need to request one of them in your Slurm script. Because --nodelist asks for every host it names, list a single node for a one-node job, e.g. --nodelist=biocrunch4

  • biocrunch4-8
  • speedy3-6
  • bigram1-2
  • gpu03
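
A job script can check early on whether it landed on one of the nodes in this list. The sketch below is only an illustration: has_scratch is a made-up helper name, and it assumes the short hostnames reported by the nodes match the names in the list exactly.

```shell
#!/bin/bash
# has_scratch HOSTNAME
# Returns 0 if HOSTNAME is one of the nodes listed above as having /scratch.
# (Hypothetical helper; assumes node hostnames match the list exactly.)
has_scratch() {
    # Strip any domain suffix first, so "biocrunch4.some.domain" becomes "biocrunch4".
    case "${1%%.*}" in
        biocrunch[4-8]|speedy[3-6]|bigram[1-2]|gpu03) return 0 ;;
        *) return 1 ;;
    esac
}

# Example use near the top of a job script:
if has_scratch "$(uname -n)"; then
    echo "local /scratch should be available on this node"
else
    echo "this node is not on the /scratch list" >&2
fi
```

In a real job script, the else branch would typically exit so that the job does not try to write to a /scratch directory that does not exist.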

 

The /scratch directory is local to each of these nodes and cannot be accessed from other nodes. In this guide, we will create a script that uses the /scratch space for I/O-intensive jobs. The script creates a directory under /scratch, copies the files required for the job from your working directory into it, runs the job, and copies the necessary output files back to your working directory. Once the job is done, it deletes our files from the /scratch directory.
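
The stage-in, run, stage-out, and clean-up steps described above can be sketched as one reusable function. This is a minimal sketch, not the script used in this guide: run_in_scratch is a hypothetical name, all three paths are assumed to be absolute, and the trap-based cleanup is an addition so that the scratch directory is removed even when the job command fails.

```shell
#!/bin/bash
# run_in_scratch SRC_DIR SCRATCH_DIR DEST_DIR COMMAND [ARGS...]
# Copies the contents of SRC_DIR into SCRATCH_DIR, runs COMMAND there,
# copies everything back to DEST_DIR, and removes SCRATCH_DIR afterwards.
# All three directories must be given as absolute paths.
# The body runs in a subshell, so the cd and the trap do not leak to the caller.
run_in_scratch() (
    src="$1"; scratch="$2"; dest="$3"
    shift 3
    mkdir -p "$scratch" "$dest" || exit 1
    trap 'rm -rf "$scratch"' EXIT          # clean up on success or failure
    cp -r "$src"/. "$scratch"/ || exit 1   # stage inputs onto the local disk
    cd "$scratch" || exit 1
    "$@" || exit 1                         # run the job inside /scratch
    cp -r "$scratch"/. "$dest"/            # copy inputs and results back
)

# Example call (MY_PROGRAM and RESULT_DIR are placeholders):
# run_in_scratch /work/LAS/jones-lab/scratchSpace \
#     "/scratch/jones/documentation/$SLURM_ARRAY_TASK_ID" \
#     "$RESULT_DIR" MY_PROGRAM
```

Note that when the command fails, nothing is copied back, so any partial output is lost along with the scratch directory. That may or may not be what you want for your job.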

 

If you are unfamiliar with how to submit a batch script or how to create a Slurm job array script, please refer to the guides below:

How to create a slurm job array script (for creating scripts)

Slurm basics (for submitting batch scripts)

 

Below is an example script:

#!/bin/bash
#SBATCH --nodes=1 # request one node
#SBATCH --cpus-per-task=1  # ask for 1 cpu
#SBATCH --mem=1G # maximum amount of memory this job will be given; try to estimate this to the best of your ability
#SBATCH --time=0-00:30:00 # ask that the job be allowed to run for 30 minutes.
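#SBATCH --array=1-2 # run two array tasks, with task IDs 1 and 2
#SBATCH --nodelist=biocrunch4 # /scratch only exists on certain nodes; pick one from the list above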

# everything below this line is optional, but are nice to have quality of life things
#SBATCH --output=job.%J.out # store the console output in a file called job.<assigned job number>.out
#SBATCH --error=job.%J.err # store the program's error messages in a file called job.<assigned job number>.err
 

mkdir -p "/scratch/jones/documentation/$SLURM_ARRAY_TASK_ID" # make this task's directory in /scratch
cp -r /work/LAS/jones-lab/scratchSpace "/scratch/jones/documentation/$SLURM_ARRAY_TASK_ID" # copy from /work to /scratch
cd "/scratch/jones/documentation/$SLURM_ARRAY_TASK_ID" || exit 1 # go into the scratch directory; stop if that fails
echo "My SLURM_ARRAY_TASK_ID: $SLURM_ARRAY_TASK_ID" >> test.txt # make a sample .txt file
cp -r "/scratch/jones/documentation/$SLURM_ARRAY_TASK_ID" /work/LAS/jones-lab/scratchSpace # copy from /scratch back to /work
rm -rf "/scratch/jones/documentation/$SLURM_ARRAY_TASK_ID" # remove only this task's directory, so another array task on the same node is not affected

 

Transfer your script via SCP or WinSCP (which is available in the software center). Connect to one of the nodes listed above and then run:

$ sbatch myscript.sh
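
On success, sbatch prints a line of the form "Submitted batch job <id>". The sketch below pulls the job ID out of that line so it can be passed to squeue. The helper name job_id_from_sbatch is made up for illustration, and the sketch assumes a single-cluster setup where the ID is the last word of the message.

```shell
#!/bin/bash
# job_id_from_sbatch LINE
# Prints the job ID from sbatch's "Submitted batch job <id>" message.
# Returns 1 if LINE does not look like that message.
job_id_from_sbatch() {
    case "$1" in
        "Submitted batch job "*) echo "${1##* }" ;;   # keep the last word
        *) return 1 ;;
    esac
}

# Example use on the cluster (commented out because it needs Slurm):
# out=$(sbatch myscript.sh) && jobid=$(job_id_from_sbatch "$out") && squeue -j "$jobid"
```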

 

If you ran the sample script above, it would produce .out and .err files in the directory from which you submitted the job. However, the results of the script (in this case, the Slurm task ID in a text file) are stored under /work/LAS/jones-lab/scratchSpace, the destination of the final cp command. If we were to check the /scratch directory, our task directories would be gone, since the script deletes them. In the /work/LAS/jones-lab/scratchSpace folder, we can see that we have two folders, one per array task:

[Screenshot: scratch_folders]

To verify the contents of the folder, you can navigate into the folder named "1" and check the .txt file.

[Screenshot: verify_scratch]

As you can see, we have successfully printed the task ID to a text file called test.txt, as specified in the script.
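
The same check can be done for every array task at once instead of opening each folder by hand. The following is a rough sketch: check_results is a hypothetical helper, and it assumes each task's folder is named after its task ID and contains the test.txt written by the sample script.

```shell
#!/bin/bash
# check_results BASE_DIR FIRST LAST
# For each task ID from FIRST to LAST, checks that BASE_DIR/<id>/test.txt
# contains a line ending in that task ID. Returns 1 if any task fails.
check_results() {
    base="$1"; id="$2"; last="$3"; status=0
    while [ "$id" -le "$last" ]; do
        if grep -q "My SLURM_ARRAY_TASK_ID: *$id\$" "$base/$id/test.txt" 2>/dev/null; then
            echo "task $id: ok"
        else
            echo "task $id: missing or unexpected test.txt" >&2
            status=1
        fi
        id=$((id + 1))
    done
    return "$status"
}

# Example for the two-task array in this guide:
# check_results /work/LAS/jones-lab/scratchSpace 1 2
```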