HPC Slurm Job Script: Gromacs
This page provides a number of sample Slurm scripts that can be used as a guide for HPC job submission. Slurm is a highly scalable job scheduler used by many open-source cluster management systems and runs on a wide range of Linux clusters. Before using any of the sample scripts, make sure you understand every sbatch directive they contain.
​
Important Commands:
​
1. sinfo: reports the state of the job scheduling partitions and the compute nodes.
​
Example:
$ sinfo
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
debug up 30:00 2 idle node[1-2]
batch* up 3-00:00:00 8 idle node[3-10]
​
$ sinfo -N -l
Sat Jun 13 20:37:23 2020
NODELIST NODES PARTITION STATE CPUS S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON
node[1-2] 1 debug idle 8 1:4:2 1 0 1 (null) none
node[3-10] 1 batch* idle 8 1:4:2 1 0 1 (null) none
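
sinfo can also be restricted to a single partition with the -p option, which is convenient on clusters with many partitions (the partition name below is taken from the example above, and the output follows the same columns):
$ sinfo -p debug
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
debug up 30:00 2 idle node[1-2]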
​
2. squeue: shows the status of the scheduled jobs.
​
Example:
$ squeue
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
2138 batch gromacs_ sudip PD 0:00 1 (Resources)
2137 batch gromacs_ sudip R 0:37 8 node[3-10]
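
To list only your own jobs, or to check a single job by its ID, squeue also accepts user and job filters (the job ID below is taken from the example above):
$ squeue -u $USER
$ squeue -j 2137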
​
3. scancel: cancels a running or pending job.
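
Example (the job ID is taken from the squeue output above; the second form cancels all of your own jobs, so use it with care):
$ scancel 2138
$ scancel -u $USER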
​
4. scontrol: is used to view and modify the Slurm state; it is primarily an administrative tool.
​
Example:
$scontrol show node: shows the Slurm state of the individual compute nodes.
​
$scontrol show node
NodeName=water Arch=x86_64 CoresPerSocket=12
CPUAlloc=0 CPUErr=0 CPUTot=48 CPULoad=10.43
AvailableFeatures=(null)
ActiveFeatures=(null)
Gres=(null)
NodeAddr=water NodeHostName=water Version=17.11
OS=Linux 5.3.0-59-generic #53~18.04.1-Ubuntu SMP Thu Jun 4 14:58:26 UTC 2020
RealMemory=1 AllocMem=0 FreeMem=3762 Sockets=2 Boards=1
State=IDLE ThreadsPerCore=2 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
Partitions=debug,batch
BootTime=2020-06-13T20:17:09 SlurmdStartTime=2020-06-13T20:18:00
CfgTRES=cpu=8,mem=1M,billing=8
AllocTRES=
CapWatts=n/a
CurrentWatts=0 LowestJoules=0 ConsumedJoules=0
ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s
​
​
$scontrol show partition: shows more detailed information about the job scheduling partitions.
​
$scontrol show partition
​
PartitionName=debug
AllowGroups=ALL AllowAccounts=ALL AllowQos=ALL
AllocNodes=ALL Default=NO QoS=N/A
DefaultTime=NONE DisableRootJobs=NO ExclusiveUser=NO GraceTime=0 Hidden=NO
MaxNodes=UNLIMITED MaxTime=00:30:00 MinNodes=1 LLN=NO
MaxCPUsPerNode=UNLIMITED
Nodes=water
PriorityJobFactor=1 PriorityTier=1 RootOnly=NO ReqResv=NO OverSubscribe=NO
OverTimeLimit=NONE PreemptMode=OFF
State=UP TotalCPUs=8 TotalNodes=1 SelectTypeParameters=NONE
DefMemPerNode=UNLIMITED MaxMemPerNode=UNLIMITED
​
PartitionName=batch
AllowGroups=ALL AllowAccounts=ALL AllowQos=ALL
AllocNodes=ALL Default=YES QoS=N/A
DefaultTime=NONE DisableRootJobs=NO ExclusiveUser=NO GraceTime=0 Hidden=NO
MaxNodes=UNLIMITED MaxTime=3-00:00:00 MinNodes=1 LLN=NO
MaxCPUsPerNode=UNLIMITED
Nodes=water
PriorityJobFactor=1 PriorityTier=1 RootOnly=NO ReqResv=NO OverSubscribe=NO
OverTimeLimit=NONE PreemptMode=OFF
State=UP TotalCPUs=8 TotalNodes=1 SelectTypeParameters=NONE
DefMemPerNode=UNLIMITED MaxMemPerNode=UNLIMITED
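
scontrol show job gives a similarly detailed report for a single job, and scontrol hold/release can be used to hold or release one of your own pending jobs (the job IDs below are taken from the squeue example above):
$scontrol show job 2137
$scontrol hold 2138
$scontrol release 2138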
​
Sample Job Scripts:
​
A job submission script specifies the program to run and requests the resources it needs from the scheduler. Below are several sample Slurm scripts for GROMACS job submission.
​
A. Slurm script for a non-GPU (multithreaded) GROMACS job on a single node, running for no longer than 30 minutes.
​
gmx_multithread.sh
​
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8
#SBATCH --time=00:30:00
#SBATCH --partition=debug
#SBATCH --job-name=test-job
#SBATCH --output=test-job1.out

# Load the GROMACS environment. sbatch stops reading #SBATCH directives at the
# first executable command, so this line must come after the directives above.
source ~/gmx_5-1-4.env

echo host= `hostname`
echo `date`

# Use as many threads as CPUs allocated to this task
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
gmx grompp -f nvt.mdp -c em.gro -p topol.top -o nvt.tpr
gmx mdrun -nt $OMP_NUM_THREADS -v -deffnm nvt
echo `date`
Job submission command:
$sbatch gmx_multithread.sh
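
On successful submission, sbatch prints the job ID, which can then be used with squeue, scancel, and scontrol; the ID shown below is only illustrative:
Submitted batch job 2139
$ squeue -j 2139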
B. Slurm script for a GPU-enabled GROMACS job on a single node, running for 10 hours.
gmx_cpugpu.sh
#!/bin/bash
#SBATCH --job-name=ctab-0.0mM-01
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=40
#SBATCH --gres=gpu:TitanRTX:1
#SBATCH --time=10:00:00
#SBATCH --partition=batch
#SBATCH --output=ctab-0.0mM-01.out
#SBATCH --error=ctab-0.0mM-01-err.out

# Load the GROMACS environment after the #SBATCH directives
source ~/gmx_2018-3.env
echo Host = `hostname`
echo Start = `date`
gmx grompp -f nvt.mdp -c em.gro -p topol.top -o nvt.tpr
srun --accel-bind=g gmx mdrun -v -deffnm nvt
echo End = `date`
Job submission command for specified partition:
$sbatch --partition=batch gmx_cpugpu.sh
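
As the command above shows, options passed on the sbatch command line take precedence over the corresponding #SBATCH directives inside the script, so settings such as the walltime or job name can be changed at submission time without editing the script (the values below are only illustrative):
$sbatch --partition=batch --time=05:00:00 --job-name=ctab-test gmx_cpugpu.sh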
A variant of the same script using two GPUs with two MPI ranks:
gmx_cpugpu.sh
#!/bin/bash
#SBATCH --job-name=ctab-50.0mM-01
#SBATCH --nodes=1
#SBATCH --ntasks=2
#SBATCH --cpus-per-task=48
#SBATCH --gres=gpu:tesla:2
#SBATCH --time=00:30:00
#SBATCH --partition=debug
#SBATCH --output=ctab-50.0mM-01.out
#SBATCH --error=ctab-50.0mM-01-err.out

echo Host = `hostname`
echo Start = `date`
gmx grompp -f nvt.mdp -c em.gro -p topol.top -o nvt.tpr
# Two MPI ranks, one GPU each. This assumes an MPI-enabled GROMACS binary
# (gmx_mpi); pick the MPI plugin your cluster supports (see srun --mpi=list).
srun --mpi=pmi2 --accel-bind=g gmx_mpi mdrun -v -deffnm nvt
echo End = `date`
Job submission command:
$sbatch gmx_cpugpu.sh
C. Slurm script for a GPU-enabled GROMACS job with explicit memory allocation and e-mail notification:
gmx_cpugpu_mem.sh
#!/bin/bash
#SBATCH --job-name=ctab-50.0mM-01
#SBATCH --account=water
#SBATCH --mail-user=water@cup.edu.in
#SBATCH --mail-type=ALL
#SBATCH --nodes=1
#SBATCH --ntasks=2
#SBATCH --cpus-per-task=48
#SBATCH --gres=gpu:tesla:2
#SBATCH --time=00:30:00
#SBATCH --qos=water
#SBATCH --partition=debug
#SBATCH --output=ctab-50.0mM-01.out
#SBATCH --error=ctab-50.0mM-01-err.out
#SBATCH --mem-per-cpu=1000M

echo Host = `hostname`
echo Start = `date`
gmx grompp -f nvt.mdp -c em.gro -p topol.top -o nvt.tpr
# As above, this assumes an MPI-enabled GROMACS binary (gmx_mpi); pick the MPI
# plugin your cluster supports (see srun --mpi=list).
srun --mpi=pmi2 --accel-bind=g gmx_mpi mdrun -v -deffnm nvt
echo End = `date`
Job submission command:
$sbatch gmx_cpugpu_mem.sh
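
Once a job has finished, its resource usage can be checked with sacct, provided job accounting is enabled on the cluster (the job ID below is taken from the squeue example above):
$ sacct -j 2137 --format=JobID,JobName,Partition,State,Elapsed,MaxRSS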
​