Document running MPI simulations with dump/restore #18

Open
VatsalSy wants to merge 1 commit into main from docs/mpi-dump-restore-tips

@VatsalSy
Member

Summary

Adds a Tips.md section, "Running MPI Simulations with dump / restore", covering the parallel checkpoint/restart workflow used across CoMPhy Lab drop/bubble codes (the two-phase init → main pattern).

Written up from debugging a Snellius parameter sweep. It documents:

  • Compile flags: `CC99='mpicc -std=c99 -D_GNU_SOURCE=1' qcc -Wall -O2 -D_MPI=1 -disable-dimensions`. `-D_GNU_SOURCE=1` is required: recent Basilisk (`grid/memindex/virtual.h`) uses `MAP_ANONYMOUS`/`MADV_DONTNEED` and the build fails without it. Also notes the serial-only init phase and the qcc `../`-path crash.
  • HPC modules: compile-environment and run-environment must match (Snellius 2024 + OpenMPI/5.0.3-GCC-13.3.0 example); a toolchain mismatch can segfault inside the MPI restore path.
  • dump/restore across process counts: `restore()` redistributes the tree over the current `npe()`; a geometry-only `dumpInit` is reusable across a parameter sweep; `restore_mpi`/`mpi_boundary_update` startup segfaults are almost always an environment mismatch.
  • Sweep pattern: init once → stage the shared `dumpInit` as `dump` per case → one MPI job per case.

Doc-only change; placed between "Installing Basilisk" and "Patches".
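
For reference, the staging step of the sweep pattern above could be sketched roughly as follows. This is an illustration only, not the script used for the Snellius sweep: `stage_case` and `launch_line` are hypothetical helpers, the case-directory layout and the `./solver` path are assumptions, and `launch_line` only prints the launch command so the sketch can be tried without an MPI installation.

```shell
#!/bin/sh
# Sketch of "init once, restart many". All helper names are hypothetical.

# Stage one case: create its directory and copy the shared, geometry-only
# dumpInit into it under the name "dump" for the main phase to restore.
stage_case() {
  init_dump=$1   # path to the shared dumpInit
  case_dir=$2    # per-case working directory
  mkdir -p "$case_dir" && cp "$init_dump" "$case_dir/dump"
}

# Print (not run) the MPI launch line for one case. The process count may
# differ between cases, since restore() redistributes over the current npe().
launch_line() {
  case_dir=$1
  nprocs=$2
  printf 'cd %s && mpirun -np %s ./solver\n' "$case_dir" "$nprocs"
}
```

A sweep would then loop over its cases, calling `stage_case dumpInit "cases/$name"` once per case and submitting one MPI job for each; on a cluster the printed `mpirun` line would normally be replaced by whatever launcher the scheduler expects.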

Add a Tips.md section covering parallel runs that checkpoint with dump()
and restart with restore(): the required MPI compile flags
(CC99='mpicc -std=c99 -D_GNU_SOURCE=1', -D_MPI=1), matching the compile
and run toolchains, dump/restore behaviour across process counts, and the
init-once / restart-many sweep pattern.
Copilot AI review requested due to automatic review settings May 21, 2026 18:22

Copilot AI left a comment

Pull request overview

Adds documentation to help users run Basilisk MPI simulations using the common dump()/restore() checkpoint-restart workflow (including the “serial init → MPI main” pattern used in CoMPhy Lab codes), with practical HPC-oriented guidance.

Changes:

  • Adds a new “Running MPI Simulations with dump / restore” section to Tips.md.
  • Documents recommended MPI compile flags (including _GNU_SOURCE) and common failure modes.
  • Provides an HPC modules example and a suggested parameter-sweep workflow.

Comment thread Tips.md
Comment on lines +211 to +216
Build the MPI executable with:

```shell
CC99='mpicc -std=c99 -D_GNU_SOURCE=1' qcc -Wall -O2 -D_MPI=1 -disable-dimensions solver.c -o solver -lm
```
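
Because a compile/run toolchain mismatch is the failure mode this section warns about, a small pre-flight check around the build command may help. The sketch below is an assumption-laden illustration, not part of Basilisk or of the proposed Tips.md text: `check_tools`, `record_tool`, and `same_tool` are hypothetical helpers, and comparing the resolved path of `mpicc` is only a rough proxy for "same modules loaded".

```shell
#!/bin/sh
# Pre-flight sketch; helper names are hypothetical, not Basilisk tooling.

# Return non-zero if any of the named tools is missing from PATH.
check_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      printf 'missing: %s\n' "$tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# At build time: record where a tool (e.g. mpicc) resolved to.
record_tool() {
  tool=$1
  out=$2
  command -v "$tool" > "$out"
}

# At run time: succeed only if the tool resolves to the recorded location.
same_tool() {
  tool=$1
  recorded=$2
  [ "$(command -v "$tool")" = "$(cat "$recorded")" ]
}
```

One possible use is to run `check_tools qcc mpicc mpirun && record_tool mpicc build_mpicc.txt` just before compiling, then call `same_tool mpicc build_mpicc.txt` at the top of the job script and abort if it fails. This does not replace loading the same modules in both environments; it only catches the most obvious drift.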
