COMSOL is typically run on a single non-GPU node. COMSOL does not support GPU computing as of 2017, and we've not yet written documentation for running COMSOL across multiple compute nodes.
Refer to the Slurm Quick Start User Guide for more information on Slurm scripts.
Single-computer COMSOL Job
Start by creating a COMSOL .mph file. For this example, we'll use the Application Library included with COMSOL 5.2a to create a new instance of the cylinder flow model in the COMSOL Multiphysics / Fluid Dynamics category.
COMSOL Application Library (MacOS)
COMSOL Application Library (Windows)
Save the .mph file to a writable folder, and copy it to your HPC file server space.
In the same folder where you copied the .mph file, create a Slurm job script.
An incompatibility between default COMSOL settings and the central file server means we need to make a copy of the COMSOL batch settings for this job, and store a few files on disk space local to the compute node.
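As a rough illustration of the approach described here, the following sketch writes out a single-node job script that keeps COMSOL's settings and temporary files on node-local disk. It is not the original script from this page. The script name comsol_job.sh, the input file name cylinder_flow.mph, the module name comsol, the use of /tmp as node-local disk, and the core count and time limit are all assumptions to adapt to your cluster. Instead of copying an existing settings folder, the sketch simply points COMSOL at fresh per-job directories, which is a simplification.

```shell
#!/bin/sh
# Write a hypothetical single-node COMSOL job script to comsol_job.sh.
# The quoted 'EOF' keeps the variables unexpanded until the job runs.
cat > comsol_job.sh <<'EOF'
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=8
#SBATCH --time=00:30:00

# Assumption: COMSOL is provided through an environment module named "comsol".
module load comsol

# Assumption: input file name. The output name is the one used in this guide.
INPUT=cylinder_flow.mph
OUTPUT=cylinder_flow_solved.mph

# Per-job directory on disk local to the compute node.
# Assumption: /tmp is node-local; use your cluster's local scratch path.
LOCAL_DIR=$(mktemp -d "/tmp/comsol_${SLURM_JOB_ID}_XXXXXX")
mkdir -p "${LOCAL_DIR}/configuration" "${LOCAL_DIR}/data" "${LOCAL_DIR}/tmp"

# Keep COMSOL's configuration, workspace, and temporary files off the
# central file server by pointing them at the local directory.
comsol batch -np "${SLURM_NTASKS}" \
    -configuration "${LOCAL_DIR}/configuration" \
    -data "${LOCAL_DIR}/data" \
    -tmpdir "${LOCAL_DIR}/tmp" \
    -inputfile "${INPUT}" -outputfile "${OUTPUT}"

# Clean up the node-local files when the solve finishes.
rm -rf "${LOCAL_DIR}"
EOF
```

The recovery directory is deliberately left at its default location in this sketch, so checkpoint files still end up under your home directory.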
Submit the job from the login node with the sbatch command, passing it the name of your job script.
While the job is running, you'll find a file cylinder_flow_solved.mph.status that indicates whether the job is running or complete, and you can monitor the slurm-JOBID.out file for the solution's progress in more detail.
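A small helper along these lines can print both files in one go. It is only a convenience sketch: the function name show_progress is made up for this example, and it assumes nothing about the contents of either file beyond their being plain text.

```shell
# Print the status file and the tail of the Slurm log, if they exist yet.
# Usage: show_progress STATUS_FILE LOG_FILE
show_progress() {
    status_file=$1
    log_file=$2
    if [ -f "$status_file" ]; then
        echo "== $status_file =="
        cat "$status_file"
    else
        echo "no status file yet: $status_file"
    fi
    if [ -f "$log_file" ]; then
        echo "== last lines of $log_file =="
        tail -n 5 "$log_file"
    fi
}
```

For example, run show_progress cylinder_flow_solved.mph.status slurm-JOBID.out, with JOBID replaced by the number that sbatch reported. To follow the log continuously, tail -f on the same log file also works.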
When the job completes, you should have the cylinder_flow_solved.mph file in the current directory. The cylinder_flow_solved.mph.status file will look something like
The slurm-JOBID.out file will look something like
Copy cylinder_flow_solved.mph back to your local computer and open it in COMSOL. The figures below show the velocity at the last time step of the simulation.
Solved COMSOL file (MacOS)
Solved COMSOL file (Windows)
COMSOL Parallel Performance
For this particular CFD model, COMSOL shows consistent speed improvements up to 8 cores (3.3× improvement), but adding more cores shows substantially diminished returns (3.7‑3.8× improvement when using 16‑28 cores). Other models may scale better or worse, depending on model size and type of computation required.
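One way to read these numbers is as parallel efficiency, the speedup divided by the number of cores. The symbols below are introduced only for this illustration, and the calculation assumes the quoted improvements are measured against a single-core run. Since the text gives a range of 3.7‑3.8× across 16‑28 cores, the efficiencies at 16 and 28 cores are shown as bounds.

```latex
S(n) = \frac{T_1}{T_n}, \qquad E(n) = \frac{S(n)}{n}

E(8) = \frac{3.3}{8} \approx 0.41

\frac{3.7}{16} \approx 0.23 \;\le\; E(16) \;\le\; \frac{3.8}{16} \approx 0.24

\frac{3.7}{28} \approx 0.13 \;\le\; E(28) \;\le\; \frac{3.8}{28} \approx 0.14
```

Under these assumptions, each core at 28 cores contributes roughly a third of what it does at 8 cores, so for this model a request of around 8 cores is likely the better use of an allocation.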
Recovering from a Terminated Job
By default, COMSOL periodically checkpoints its solution to safeguard against a program crash, or in case the job is terminated for exceeding its time limit. These checkpoint files are easily recovered when running the COMSOL solution on your local computer, but using checkpoint files from the HPC is a bit more involved. You can simulate a terminated job by submitting the previous COMSOL job to the HPC and requesting 1 core for 2 minutes with the #SBATCH --ntasks-per-node=1 and #SBATCH --time=00:02:00 directives in the job script (the cylinder flow job should take around 7-8 minutes on a single core).
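If you'd rather not edit the job script by hand, a sketch like the following derives a short-running copy of it. The function name make_short_job is hypothetical, and it assumes the original script already contains --ntasks-per-node and --time directives at the start of a line; if it doesn't, the copy is left unchanged and you'll need to add them yourself.

```shell
# Copy a job script, forcing 1 core and a 2-minute time limit.
# Usage: make_short_job ORIGINAL_SCRIPT SHORT_SCRIPT
make_short_job() {
    sed -e 's/^#SBATCH --ntasks-per-node=.*/#SBATCH --ntasks-per-node=1/' \
        -e 's/^#SBATCH --time=.*/#SBATCH --time=00:02:00/' \
        "$1" > "$2"
}
```

For example, make_short_job comsol_job.sh comsol_short.sh followed by sbatch comsol_short.sh, where comsol_job.sh stands in for whatever your job script is called.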
When the job terminates, examine the comsol.recoveries file stored in the .comsol/v52a folder in your home directory. It should contain one line for each terminated solution file:
Each of these lines names a folder, not a file. Copy the necessary folders to your local computer, into either /Users/yourusername/Library/Preferences/COMSOL/v52a/recoveries on MacOS, or c:\users\yourusername\.comsol\v52a\recoveries on Windows.
Copy of recoveries folder (MacOS)
Copy of recoveries folder (Windows)
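Because each entry is a folder, it can be convenient to bundle them into a single archive on the HPC before transferring. The sketch below is one way to do that. It assumes each line of comsol.recoveries is a path to a folder readable from where you run it, and the function name bundle_recoveries and the archive name are made up for this example.

```shell
# Bundle every folder listed in a comsol.recoveries file into one archive.
# Usage: bundle_recoveries RECOVERIES_FILE ARCHIVE.tar.gz
bundle_recoveries() {
    list_file=$1
    archive=$2
    staging=$(mktemp -d) || return 1
    # Read one folder path per line; also handle a last line with no newline.
    while IFS= read -r entry || [ -n "$entry" ]; do
        [ -n "$entry" ] || continue
        if [ -d "$entry" ]; then
            cp -R "$entry" "$staging/"
        else
            echo "skipping (not a directory): $entry" >&2
        fi
    done < "$list_file"
    # Archive the staged copies, then clean up and report tar's status.
    tar -czf "$archive" -C "$staging" .
    status=$?
    rm -rf "$staging"
    return $status
}
```

After running something like bundle_recoveries ~/.comsol/v52a/comsol.recoveries recoveries.tar.gz on the HPC, copy the archive to your local computer with your usual transfer tool and unpack it inside the recoveries folder for your platform.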
Edit the comsol.recoveries file in a plain-text editor (Notepad, Notepad++, TextMate, vi, nano, etc.). This file is located in /Users/yourusername/Library/Preferences/COMSOL/v52a on MacOS, or in c:\users\yourusername\.comsol\v52a on Windows. Add a line pointing to the .mph folder in your local computer's recoveries folder, save the file, and exit your text editor:
Edited comsol.recoveries file (MacOS)
Edited comsol.recoveries file (Windows)
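On MacOS, or anywhere else with a POSIX shell, the same edit can be made from a terminal. This sketch assumes, as the step above suggests, that each entry in comsol.recoveries is simply a folder path on its own line. The function name add_recovery_entry is hypothetical.

```shell
# Append a recovery folder path to comsol.recoveries unless already listed.
# Usage: add_recovery_entry RECOVERIES_FILE RECOVERY_FOLDER
add_recovery_entry() {
    recoveries_file=$1
    recovery_folder=$2
    # -x matches the whole line, -F treats the path as a literal string.
    if grep -qxF -- "$recovery_folder" "$recoveries_file" 2>/dev/null; then
        return 0
    fi
    # If the file does not end in a newline, add one before appending.
    if [ -s "$recoveries_file" ] && [ -n "$(tail -c 1 "$recoveries_file")" ]; then
        printf '\n' >> "$recoveries_file"
    fi
    printf '%s\n' "$recovery_folder" >> "$recoveries_file"
}
```

Call it with the path to your local comsol.recoveries file and the full path of the copied folder. Running it twice with the same folder leaves the file unchanged.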
Finally, start COMSOL on your local computer, and select the File / Open Recovery File menu. Select the recovery file from the list, and examine the results from before the job terminated:
Recovery File dialog and intermediate COMSOL result (MacOS)
Recovery File dialog and intermediate COMSOL result (Windows)