Scratch Space¶
Our cluster offers three types of high-performance storage, each dedicated to scratch space for temporary data processing.
| Type | Scheduler resource | Description | Scope |
|---|---|---|---|
| Memory | scratch_memory | In memory filesystem (tmpfs). | Node |
| Local | scratch_local | High-speed NVMe SSDs. | Node |
| Shared | scratch_shared | Clustered filesystem. | Cluster |
Note
The scratch filesystem is hard-capped at the exact size you request; if your job attempts to write more data than that quota allows, the writes will fail (e.g., ENOSPC errors) and the job may terminate.
Each job may use only one scratch storage type (scratch_local, scratch_shared, or scratch_memory); requesting or writing to more than one scratch space in the same job is not permitted and may cause the job to fail.
Warning
Scratch directories are created automatically when your job starts and are deleted immediately after the job finishes to keep the scratch space free of leftover files.
Make sure your script copies every file you need to permanent storage before it exits, because anything left in scratch will be erased.
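One way to make the copy-out step robust is a shell `trap`, which runs even when the script exits early. This is only a sketch; the `my_result` filename is illustrative, not something the cluster defines:

```shell
#!/bin/bash
# Sketch: copy results out of scratch even if the script exits early.
# The my_result filename is illustrative, not cluster-defined.
copy_results() {
    cp "${SCRATCH}/my_result" "${HOME}/" 2>/dev/null || true
}
trap copy_results EXIT   # fires on normal exit and on most terminations
```

Note that a trap cannot help if the job is killed with SIGKILL (e.g., a hard walltime limit), so copying results incrementally during long runs is still advisable.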
Memory Scratch¶
Memory scratch is a temporary, in-memory filesystem created with Linux tmpfs. Because the data live entirely in RAM (and, on NUMA machines, in the node-local memory closest to each core), access times are orders of magnitude faster than to any disk-based scratch device.
Example¶
#!/bin/bash
#PBS -l select=1:ncpus=4:mem=8gb:scratch_memory=8gb
#PBS -l walltime=02:00:00
cp ${HOME}/my_data ${SCRATCH}/my_data # Copy data to scratch
cd ${SCRATCH} # Enter scratch directory
module load mysoftware # Load any necessary modules
./my_program # Run your application
cp ${SCRATCH}/my_result ${HOME}/my_result # Copy results from scratch to home
Local Scratch¶
Local scratch is a node-local filesystem backed by NVMe SSD(s).
Because the data stay on fast, directly attached NVMe drives rather than on shared network storage, random-I/O latency and bandwidth are dramatically better (typically 5-10× faster).
It is ideal for large, I/O-intensive working sets that do not fit in RAM but still need high throughput.
Example¶
#!/bin/bash
#PBS -l select=1:ncpus=4:mem=8gb:scratch_local=8gb
#PBS -l walltime=02:00:00
cp ${HOME}/my_data ${SCRATCH}/my_data # Copy data to scratch
cd ${SCRATCH} # Enter scratch directory
module load mysoftware # Load any necessary modules
./my_program # Run your application
cp ${SCRATCH}/my_result ${HOME}/my_result # Copy results from scratch to home
Shared Scratch¶
The shared scratch space is a cluster-wide filesystem built on CEPH and backed by NVMe SSDs.
All nodes mount the same scratch directory $SCRATCH, so processes on different nodes can create, read, and delete each other's temporary files without extra staging.
Although the underlying drives are fast, bandwidth and IOPS are shared among all jobs using the scratch pool.
Note
scratch_shared is a cluster-wide resource, so request it with its own directive, #PBS -l scratch_shared=8gb, rather than embedding it in the select line (e.g., #PBS -l select=1:ncpus=1:mem=8gb:scratch_shared=8gb), which is reserved for per-node resources.
Example¶
#!/bin/bash
#PBS -l select=4:ncpus=4:mem=8gb
#PBS -l scratch_shared=8gb
#PBS -l walltime=02:00:00
cd ${SCRATCH} # Enter scratch directory
module load mysoftware # Load any necessary modules
./my_program # Run your application
This example launches a two-hour job on four nodes, each with 4 CPU cores and 8 GB of RAM, and allocates an 8 GB shared scratch filesystem visible from every node, so all resource chunks can read and write the same temporary files during the run.
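Because every node sees the same ${SCRATCH} directory, a common pattern is for each worker to write a partial result there and for one process to merge the parts afterwards. The sketch below illustrates this; the part_* naming and the merge step are assumptions for illustration, not a cluster convention:

```shell
#!/bin/bash
# Sketch: workers on different nodes write partial results into the shared
# scratch directory; one process merges them when all parts exist.
SCRATCH="${SCRATCH:-$(mktemp -d)}"   # in a real job the scheduler sets SCRATCH

write_part() {                        # run by each worker/rank
    echo "result from rank $1" > "${SCRATCH}/part_$1"
}

write_part 0
write_part 1
cat "${SCRATCH}"/part_* > "${SCRATCH}/merged"   # run once, after all parts exist
```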
Monitoring Scratch Usage¶
Below are quick-reference commands for checking, from the login node, how much scratch space is in use and how much remains.
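As a minimal sketch, the standard df and du utilities report filesystem capacity and per-directory usage; the mktemp fallback is only so the snippet runs outside a job, where the scheduler has not set SCRATCH:

```shell
#!/bin/bash
# Sketch: check scratch capacity and usage with standard tools.
SCRATCH="${SCRATCH:-$(mktemp -d)}"   # in a real job the scheduler sets SCRATCH
df -h "${SCRATCH}"     # capacity and free space of the filesystem holding it
du -sh "${SCRATCH}"    # total size of your own scratch directory
```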