Overview

Gromacs can be run using four main classes of resources:

  • small jobs using CPU cores in a single computer,
  • small jobs using CPU cores and 1-2 GPUs in a single computer,
  • large jobs using CPU cores in multiple computers connected over a network, and
  • large jobs using CPU cores and 1-2 GPUs per computer in multiple computers connected over a network.

Input files for all the examples below are from the HECBioSim project's benchmark suite.

Refer to the Slurm Quick Start User Guide for more information on Slurm scripts.


Single-computer, non-GPU Gromacs Job

Working from the HECBioSim 20k atom model, create a Slurm job script named gromacs_multicore.sh to run the job:

gromacs_multicore.sh
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --cpus-per-task=28

INPUT=bench.tpr
OUTPUT=bench.log
module load gromacs
gmx mdrun -nt ${SLURM_CPUS_PER_TASK} -s ${INPUT} -g ${OUTPUT}

Submit the job from the login node with the command sbatch gromacs_multicore.sh. When the job completes, you should have several new files, including bench.log, which contains the Gromacs output.
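
While the job is waiting or running, you can check on it from the login node. The commands below are a generic Slurm sketch; replace 4844 with the job ID that sbatch prints for your submission.

Checking job status (optional)
# List your queued and running jobs
squeue -u $USER
# After the job finishes, summarize its runtime and final state
sacct -j 4844 --format=JobID,JobName,Elapsed,State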

Final directory listing for Gromacs multicore job
[renfro@login GROMACS]$ ls -lt
total 4196180
-rw------- 1 renfro domain users    3741 Oct 16 11:56 slurm-4844.out
-rw------- 1 renfro domain users   35271 Oct 16 11:56 bench.log
-rw------- 1 renfro domain users 2589180 Oct 16 11:56 traj.trr
-rw------- 1 renfro domain users   31284 Oct 16 11:56 ener.edr
-rw------- 1 renfro domain users 1352816 Oct 16 11:56 confout.gro
-rw------- 1 renfro domain users  472364 Oct 16 11:56 state.cpt
-rw------- 1 renfro domain users     168 Oct 16 11:53 gromacs_multicore.sh
-rw-r--r-- 1 renfro domain users  188941 Nov 21  2011 3NIR.top
-rw------- 1 renfro domain users    1247 Nov 14  2011 _bench.mdp
-rw-r--r-- 1 renfro domain users 1238752 Nov 14  2011 bench.tpr
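
To see how fast the run was, check the performance summary Gromacs writes at the end of its log file. This assumes the log ends with Gromacs' usual "Performance:" table reporting ns/day and hours/ns.

Checking simulation performance (optional)
# Print the ns/day and hours/ns summary from the end of the log
grep -B 1 "^Performance" bench.log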

Single-computer, GPU-enabled Gromacs Job

Working from the HECBioSim 20k atom model, edit the file _bench.mdp to add a line ensuring that Gromacs uses the Verlet cutoff scheme:

Section of Gromacs input file
cutoff-scheme            = verlet

and change the line:

Section of Gromacs input file
rcoulomb                 = 1.4

to:

Section of Gromacs input file
rcoulomb                 = 1.2

to match the rvdw setting.
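
If you prefer to make these edits from the command line, the following is one possible sketch using sed and a shell append; it assumes _bench.mdp does not already contain a cutoff-scheme line.

Editing _bench.mdp from the command line (optional)
# Change rcoulomb from 1.4 to 1.2 to match rvdw
sed -i 's/^rcoulomb[[:space:]]*=.*/rcoulomb                 = 1.2/' _bench.mdp
# Add the verlet cutoff scheme setting
echo 'cutoff-scheme            = verlet' >> _bench.mdp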

Regenerate the Gromacs .tpr file from the HECBioSim topology file and your updated molecular dynamics parameters file:

gmx grompp -f _bench.mdp -c bench.tpr -p 3NIR.top -o bench_gpu.tpr
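
If you want to confirm that the new .tpr file picked up the changed settings, gmx dump can print the stored run parameters back out. This is an optional check run on the login node, not a required step.

Verifying the regenerated .tpr file (optional)
module load gromacs
# Print the run parameters stored in the .tpr and show the cutoff settings
gmx dump -s bench_gpu.tpr 2>/dev/null | grep -E 'cutoff-scheme|rcoulomb|rvdw'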

Create a Slurm job script named gromacs_gpu.sh to run the job (the --cpus-per-task value has been changed from 28 to 16 as one way of letting Gromacs divide up the work for this job):

gromacs_gpu.sh
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --cpus-per-task=16
#SBATCH --gres=gpu:2

INPUT=bench_gpu.tpr
OUTPUT=bench_gpu.log
module load cuda80/toolkit gromacs
gmx mdrun -nt ${SLURM_CPUS_PER_TASK} -s ${INPUT} -g ${OUTPUT}

Submit the job from the login node with the command sbatch gromacs_gpu.sh. When the job completes, you should have several new files, including bench_gpu.log, which contains the Gromacs output.
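
To confirm that the GPUs were detected and used, you can search the log for GPU-related messages. The exact wording varies between Gromacs versions, so treat this as a rough check.

Checking GPU usage in the log (optional)
# Show GPU detection and assignment messages from the Gromacs log
grep -i "gpu" bench_gpu.log | head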

Final directory listing for Gromacs GPU job
[renfro@login GROMACS]$ ls -lt
total 5245440
-rw------- 1 renfro domain users    4220 Oct 16 13:48 slurm-4851.out
-rw------- 1 renfro domain users   33926 Oct 16 13:48 bench_gpu.log
-rw------- 1 renfro domain users 2589180 Oct 16 13:48 traj.trr
-rw------- 1 renfro domain users   31284 Oct 16 13:48 ener.edr
-rw------- 1 renfro domain users 1352816 Oct 16 13:48 confout.gro
-rw------- 1 renfro domain users  472372 Oct 16 13:48 state.cpt
-rw------- 1 renfro domain users     235 Oct 16 13:47 gromacs_gpu.sh
-rw------- 1 renfro domain users  689284 Oct 16 13:46 bench_gpu.tpr
-rw------- 1 renfro domain users   11942 Oct 16 13:46 mdout.mdp
-rw------- 1 renfro domain users    1281 Oct 16 13:36 _bench.mdp
-rw-r--r-- 1 renfro domain users  188941 Nov 21  2011 3NIR.top
-rw-r--r-- 1 renfro domain users 1238752 Nov 14  2011 bench.tpr

Multi-computer, non-GPU Gromacs Job

Working from the HECBioSim 20k atom model, create a Slurm job script named gromacs_mpi.sh to run the job (this example reuses the bench_gpu.tpr file generated in the GPU section above):

gromacs_mpi.sh
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=2
#SBATCH --cpus-per-task=4

INPUT=bench_gpu.tpr
OUTPUT=bench_mpi.log
module load gromacs
mpirun `which mdrun_mpi` -ntomp ${SLURM_CPUS_PER_TASK} -s ${INPUT} -g ${OUTPUT}

Submit the job from the login node with the command sbatch gromacs_mpi.sh. When the job completes, you should have several new files, including bench_mpi.log, which contains the Gromacs output.
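
This layout requests 2 nodes with 2 MPI tasks per node and 4 OpenMP threads per task, or 16 cores in total. To confirm how mdrun actually distributed the work, search the log for its MPI and OpenMP summary lines; the exact wording may differ between Gromacs versions.

Checking the MPI/OpenMP layout (optional)
# Show how many MPI ranks and OpenMP threads mdrun used
grep -iE "MPI process|OpenMP thread" bench_mpi.log | head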

Final directory listing for Gromacs MPI job
[renfro@login GROMACS]$ ls -lt
total 4196180
-rw------- 1 renfro domain users    3960 Oct 16 14:49 slurm-4859.out
-rw------- 1 renfro domain users   35962 Oct 16 14:49 bench_mpi.log
-rw------- 1 renfro domain users   31284 Oct 16 14:49 ener.edr
-rw------- 1 renfro domain users 2589180 Oct 16 14:49 traj.trr
-rw------- 1 renfro domain users 1352816 Oct 16 14:49 confout.gro
-rw------- 1 renfro domain users  472380 Oct 16 14:49 state.cpt
-rw------- 1 renfro domain users     167 Oct 16 14:48 gromacs_mpi.sh
-rw------- 1 renfro domain users    1247 Oct 16 13:53 _bench.mdp
-rw-r--r-- 1 renfro domain users  188941 Nov 21  2011 3NIR.top
-rw-r--r-- 1 renfro domain users 1238752 Nov 14  2011 bench.tpr

Multi-computer, GPU-enabled Gromacs Job

Working from the HECBioSim 20k atom model, edit the file _bench.mdp to add a line ensuring that Gromacs uses the Verlet cutoff scheme:

Section of Gromacs input file
cutoff-scheme            = verlet

and change the line:

Section of Gromacs input file
rcoulomb                 = 1.4

to:

Section of Gromacs input file
rcoulomb                 = 1.2

to match the rvdw setting.

Regenerate the Gromacs .tpr file from the HECBioSim topology file and your updated molecular dynamics parameters file:

gmx grompp -f _bench.mdp -c bench.tpr -p 3NIR.top -o bench_mpi_gpu.tpr

Create a Slurm job script named gromacs_mpi_gpu.sh to run the job (the --cpus-per-task value has been changed from 28 to 8 as one way of letting Gromacs divide up the work for this job):

gromacs_mpi_gpu.sh
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=2
#SBATCH --cpus-per-task=8
#SBATCH --gres=gpu:2

INPUT=bench_mpi_gpu.tpr
OUTPUT=bench_mpi_gpu.log
module load cuda80/toolkit gromacs
mpirun `which mdrun_mpi` -ntomp ${SLURM_CPUS_PER_TASK} -s ${INPUT} -g ${OUTPUT}

Submit the job from the login node with the command sbatch gromacs_mpi_gpu.sh. When the job completes, you should have several new files, including bench_mpi_gpu.log, which contains the Gromacs output.
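
To verify that the job actually ran across two nodes, Slurm's accounting records can report the node count and node list. Replace 4888 with your own job ID.

Checking which nodes the job used (optional)
# Summarize node count, node names, runtime, and final state
sacct -j 4888 --format=JobID,NNodes,NodeList,Elapsed,State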

Final directory listing for Gromacs MPI GPU job
[renfro@login GROMACS]$ ls -lt
total 4196868
-rw------- 1 renfro domain users    4082 Oct 17 11:00 slurm-4888.out
-rw------- 1 renfro domain users   32963 Oct 17 11:00 bench_mpi_gpu.log
-rw------- 1 renfro domain users 2589180 Oct 17 11:00 traj.trr
-rw------- 1 renfro domain users   31284 Oct 17 11:00 ener.edr
-rw------- 1 renfro domain users 1352816 Oct 17 11:00 confout.gro
-rw------- 1 renfro domain users  472388 Oct 17 11:00 state.cpt
-rw------- 1 renfro domain users     479 Oct 17 10:54 gromacs_mpi_gpu.sh
-rw------- 1 renfro domain users  689284 Oct 16 14:56 bench_mpi_gpu.tpr
-rw------- 1 renfro domain users   11946 Oct 16 14:56 mdout.mdp
-rw------- 1 renfro domain users    1281 Oct 16 14:56 _bench.mdp
-rw-r--r-- 1 renfro domain users  188941 Nov 21  2011 3NIR.top
-rw-r--r-- 1 renfro domain users 1238752 Nov 14  2011 bench.tpr

Improving Performance for Gromacs Jobs

Please read the Gromacs documentation section Getting good performance from mdrun for more information about benchmarking Gromacs and improving your job runtimes.

Though it is often assumed that adding more processors to a given job will automatically increase performance, models of different sizes reach peak performance at different processor counts. Additionally, simply adding minimal GPU support to an input file may not improve performance without more extensive changes.
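
One simple way to find the sweet spot for a given model is to run the same benchmark at several core counts and compare the resulting logs. The sketch below is only illustrative: it runs the single-computer 20k atom case at a few assumed core counts, each in its own directory so the output files do not overwrite each other.

Benchmarking different core counts (optional sketch)
#!/bin/bash
# Run the benchmark at several thread counts, one directory per run
for cores in 4 8 16 28; do
    mkdir -p run_${cores}
    cp bench.tpr run_${cores}/
    ( cd run_${cores} && \
      sbatch --nodes=1 --cpus-per-task=${cores} \
             --wrap="module load gromacs; gmx mdrun -nt ${cores} -s bench.tpr -g bench.log" )
done
# After all the jobs finish, compare the performance summaries:
#   grep -B 1 "^Performance" run_*/bench.log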

