HPC Slurm Job Script: Gromacs
This page provides a number of sample Slurm scripts that can be used as a guide for HPC job submission. Slurm is a highly scalable, open-source job scheduling system used on a wide range of Linux clusters. Before using any of the sample scripts, one should understand each of the #SBATCH directives.
Important Commands:
1. sinfo: reports the state of the job scheduling partitions and compute nodes.
Example:
$ sinfo
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
debug up 30:00 2 idle node[1-2]
batch* up 3-00:00:00 8 idle node[3-10]
$ sinfo -N -l
Sat Jun 13 20:37:23 2020
NODELIST NODES PARTITION STATE CPUS S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON
node[1-2] 1 debug idle 8 1:4:2 1 0 1 (null) none
node[3-10] 1 batch* idle 8 1:4:2 1 0 1 (null) none
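A few other commonly used sinfo invocations (shown as a sketch; the partition name debug is taken from the output above):
$ sinfo -p debug # show only the debug partition
$ sinfo -s # print a one-line summary of each partition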
2. squeue: shows the status of scheduled jobs.
Example:
$ squeue
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
2138 batch gromacs_ sudip PD 0:00 1 (Resources)
2137 batch gromacs_ sudip R 0:37 8 node[3-10]
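Some useful squeue variations (a sketch; the user name and job ID are illustrative):
$ squeue -u $USER # list only your own jobs
$ squeue -j 2137 # show the status of a specific job
$ squeue --start # show the estimated start time of pending jobs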
3. scancel: cancels a running or pending job.
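Example (a sketch; the job ID and job name are illustrative):
$ scancel 2138 # cancel job 2138
$ scancel -u $USER # cancel all of your own jobs
$ scancel --name=test-job # cancel jobs with the given name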
4. scontrol: an administrative tool used to view and modify the Slurm state.
Example:
$ scontrol show node: shows the Slurm state of individual nodes.
$ scontrol show node
NodeName=water Arch=x86_64 CoresPerSocket=12
CPUAlloc=0 CPUErr=0 CPUTot=48 CPULoad=10.43
AvailableFeatures=(null)
ActiveFeatures=(null)
Gres=(null)
NodeAddr=water NodeHostName=water Version=17.11
OS=Linux 5.3.0-59-generic #53~18.04.1-Ubuntu SMP Thu Jun 4 14:58:26 UTC 2020
RealMemory=1 AllocMem=0 FreeMem=3762 Sockets=2 Boards=1
State=IDLE ThreadsPerCore=2 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
Partitions=debug,batch
BootTime=2020-06-13T20:17:09 SlurmdStartTime=2020-06-13T20:18:00
CfgTRES=cpu=8,mem=1M,billing=8
AllocTRES=
CapWatts=n/a
CurrentWatts=0 LowestJoules=0 ConsumedJoules=0
ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s
$ scontrol show partition: gives more detailed information about the job scheduling partitions.
$ scontrol show partition
PartitionName=debug
AllowGroups=ALL AllowAccounts=ALL AllowQos=ALL
AllocNodes=ALL Default=NO QoS=N/A
DefaultTime=NONE DisableRootJobs=NO ExclusiveUser=NO GraceTime=0 Hidden=NO
MaxNodes=UNLIMITED MaxTime=00:30:00 MinNodes=1 LLN=NO
MaxCPUsPerNode=UNLIMITED
Nodes=water
PriorityJobFactor=1 PriorityTier=1 RootOnly=NO ReqResv=NO OverSubscribe=NO
OverTimeLimit=NONE PreemptMode=OFF
State=UP TotalCPUs=8 TotalNodes=1 SelectTypeParameters=NONE
DefMemPerNode=UNLIMITED MaxMemPerNode=UNLIMITED
PartitionName=batch
AllowGroups=ALL AllowAccounts=ALL AllowQos=ALL
AllocNodes=ALL Default=YES QoS=N/A
DefaultTime=NONE DisableRootJobs=NO ExclusiveUser=NO GraceTime=0 Hidden=NO
MaxNodes=UNLIMITED MaxTime=3-00:00:00 MinNodes=1 LLN=NO
MaxCPUsPerNode=UNLIMITED
Nodes=water
PriorityJobFactor=1 PriorityTier=1 RootOnly=NO ReqResv=NO OverSubscribe=NO
OverTimeLimit=NONE PreemptMode=OFF
State=UP TotalCPUs=8 TotalNodes=1 SelectTypeParameters=NONE
DefMemPerNode=UNLIMITED MaxMemPerNode=UNLIMITED
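A few other scontrol subcommands that are often useful (a sketch; the job ID is illustrative):
$ scontrol show job 2137 # detailed information about a job
$ scontrol hold 2138 # prevent a pending job from starting
$ scontrol release 2138 # allow a held job to start again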
Sample Job Scripts:
A job submission script specifies the program to run and requests resources from the scheduler. Several sample Slurm scripts are given below; a generic skeleton is shown first. Note that all #SBATCH directives must appear before the first executable command in the script, otherwise sbatch ignores them.
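The following is a minimal, generic sketch of an sbatch script; the environment file and program name are placeholders and should be replaced with your own.
generic_job.sh
#!/bin/bash
#SBATCH --job-name=my-job
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8
#SBATCH --time=00:30:00
#SBATCH --partition=debug
#SBATCH --output=my-job.out

# Executable commands start here; sbatch stops reading #SBATCH directives
# at the first non-comment, non-directive line.
source ~/my_application.env   # placeholder: load your software environment

echo Host = `hostname`
echo Start = `date`
./my_program                  # placeholder: the program to run
echo End = `date`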
A. Slurm script for a non-GPU (multi-threaded) GROMACS job on a single node for no longer than 30 minutes.
gmx_multithread.sh
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8
#SBATCH --time=00:30:00
#SBATCH --partition=debug
#SBATCH --job-name=test-job
#SBATCH --output=test-job1.out

# Load the GROMACS 5.1.4 environment (placed after the #SBATCH directives,
# since sbatch stops reading directives at the first executable command)
source ~/gmx_5-1-4.env

echo Host = `hostname`
echo Start = `date`

# Run mdrun with as many threads as CPUs allocated to the task
export omp_threads=$SLURM_CPUS_PER_TASK
gmx grompp -f nvt.mdp -c em.gro -p topol.top -o nvt.tpr
gmx mdrun -nt $omp_threads -v -deffnm nvt
echo End = `date`
Job submission command:
$ sbatch gmx_multithread.sh
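After submission, the job and its output can be monitored (a sketch; the job ID is illustrative):
$ squeue -j 2139 # check whether the job is pending or running
$ tail -f test-job1.out # follow the job's output file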
B. Slurm script for a GPU-enabled GROMACS job on a single node for 10 hours.
gmx_cpugpu.sh
#!/bin/bash
#SBATCH --job-name=ctab-0.0mM-01
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=40
#SBATCH --gres=gpu:TitanRTX:1
#SBATCH --time=10:00:00
#SBATCH --partition=batch
#SBATCH --output=ctab-0.0mM-01.out
#SBATCH --error=ctab-0.0mM-01-err.out

# Load the GROMACS 2018.3 environment after the #SBATCH directives
source ~/gmx_2018-3.env

echo Host = `hostname`
echo Start = `date`
gmx grompp -f nvt.mdp -c em.gro -p topol.top -o nvt.tpr
# --accel-bind=g binds the task to its allocated GPU
srun --accel-bind=g gmx mdrun -v -deffnm nvt
echo End = `date`
Job submission command for a specified partition:
$ sbatch --partition=batch gmx_cpugpu.sh
A variant of the same script that requests two tasks and two Tesla GPUs:
gmx_cpugpu.sh
#!/bin/bash
#SBATCH --job-name=ctab-50.0mM-01
#SBATCH --nodes=1
#SBATCH --ntasks=2
#SBATCH --cpus-per-task=48
#SBATCH --gres=gpu:tesla:2
#SBATCH --time=00:30:00
#SBATCH --partition=debug
#SBATCH --output=ctab-50.0mM-01.out
#SBATCH --error=ctab-50.0mM-01-err.out
echo Host = `hostname`
echo Start = `date`
gmx grompp -f nvt.mdp -c em.gro -p topol.top -o nvt.tpr
srun --mpi=mpirun --accel-bind=g gmx mdrun -v -deffnm nvt
echo End = `date`
Job submission command:
$ sbatch gmx_cpugpu.sh
C. Slurm script for a GPU-enabled GROMACS job with memory allocation and email notification:
gmx_cpugpu_mem.sh
#!/bin/bash
#SBATCH --job-name=ctab-50.0mM-01
#SBATCH --account=water
#SBATCH --mail-user=water@cup.edu.in
#SBATCH --mail-type=ALL
#SBATCH --nodes=1
#SBATCH --ntasks=2
#SBATCH --cpus-per-task=48
#SBATCH --gres=gpu:tesla:2
#SBATCH --time=00:30:00
#SBATCH --qos=water
#SBATCH --partition=debug
#SBATCH --output=ctab-50.0mM-01.out
#SBATCH --error=ctab-50.0mM-01-err.out
#SBATCH --mem-per-cpu=1000M
echo Host = `hostname`
echo Start = `date`
gmx grompp -f nvt.mdp -c em.gro -p topol.top -o nvt.tpr
srun --mpi=mpirun --accel-bind=g gmx mdrun -v -deffnm nvt
echo End = `date`
Job submission command:
$ sbatch gmx_cpugpu_mem.sh
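After a job completes, its accounting record can be inspected with sacct, assuming accounting is enabled on the cluster (a sketch; the job ID is illustrative):
$ sacct -j 2137 --format=JobID,JobName,Partition,AllocCPUS,State,Elapsed,MaxRSS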
