Modernise CMake/CUDA and fix link issues #609

olupton · 2021-08-12T11:24:23Z

Description
This changeset changes the CMake configuration of GPU builds to:

Use CMake's language support for CUDA, instead of the deprecated find_package(CUDA ...), cuda_add_library(...), etc.
Increase the minimum CMake version for GPU builds to v3.15, with an explicit error message if this is not respected.
Retire the custom CORENRN_GPU_CUDA_COMPUTE_CAPABILITY option; we now use the standard CMAKE_CUDA_ARCHITECTURES variable.
Avoid multiple device code linking steps, which apparently caused problems with global state (Random123 global state is not propagated correctly #607).
Prefix more preprocessor macro names (Prefix project macros to prevent conflicts #153).
Bump the https://github.com/BlueBrain/hpc-coding-conventions submodule commit to support clang-format-12.
Fix a silly error in "Use CUDA unified memory for Random123 state" (#595) affecting builds with asserts disabled.
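
For readers unfamiliar with CMake's first-class CUDA support, the first three items above could look roughly like the sketch below. This is not the code from this changeset: the project name, the `MYPROJ_ENABLE_GPU` option, the global minimum version of 3.10 and the default architecture of 70 are assumptions for illustration, and `cudastuff` / `foo.cu` are borrowed from the command lines further down. Note that CMake only honours CMAKE_CUDA_ARCHITECTURES from v3.18 onwards; older versions ignore it.

```cmake
# Assumed global minimum; only GPU builds need the newer CMake.
cmake_minimum_required(VERSION 3.10)
project(myproj LANGUAGES CXX)

# Hypothetical option name, used only in this sketch.
option(MYPROJ_ENABLE_GPU "Build with GPU support" OFF)

if(MYPROJ_ENABLE_GPU)
  # Fail early with a readable message instead of an obscure error later.
  if(CMAKE_VERSION VERSION_LESS 3.15)
    message(FATAL_ERROR "GPU builds require CMake >= 3.15, found ${CMAKE_VERSION}")
  endif()

  # Standard variable that replaces a project-specific compute-capability
  # option. Set a default before the language is enabled, but let a
  # user-provided -DCMAKE_CUDA_ARCHITECTURES=... take precedence.
  if(NOT DEFINED CMAKE_CUDA_ARCHITECTURES)
    set(CMAKE_CUDA_ARCHITECTURES 70)
  endif()

  # First-class CUDA language support: no find_package(CUDA) needed.
  enable_language(CUDA)

  # A plain add_library() replaces the deprecated cuda_add_library().
  add_library(cudastuff STATIC foo.cu)
endif()
```

A user would then select the target GPU with the standard variable, e.g. by passing -DCMAKE_CUDA_ARCHITECTURES=70 at configure time.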

These changes force compilation to proceed slightly differently from a standard mixed CUDA/C++ project. Instead of allowing CMake to generate an explicit device linker step, i.e.

nvcc -dc -o foo.cu.o foo.cu ...
nvcc -dlink -o dlink.o *.cu.o ...
ar qc libcudastuff.a *.cu.o dlink.o

before compiling OpenACC/GPU-enabled C++ code (which emits additional device code):

nvc++ -acc -gpu=cuda11.0,cc70 -o main main.cpp -lcudastuff ...

we instead let nvc++ do the device code linking itself, i.e. something more like

nvc++ -acc -gpu=cuda11.0,cc70 -o main.cpp.o -c main.cpp
nvc++ -acc -gpu=cuda11.0,cc70 -cuda -o main main.cpp.o *.cu.o # -cuda but no dlink.o!

which seems to give correct results. This is a bit tortuous in CMake, presumably because the same pattern would fail with a GPU-unaware C++ compiler, e.g. GCC or Clang. The hypothesis is that the old way of doing things fell foul of

It is possible to do multiple device links within a single host executable, as long as each device link is independent of the other. This requirement of independence means that they cannot share code across device executables, nor can they share addresses (e.g., a device function address can be passed from host to device for a callback only if the device link sees both the caller and potential callback callee; you cannot pass an address from one device executable to another, as those are separate address spaces).

from the CUDA documentation, while the new way ensures there is a single device link step including both the code generated from .cu files and that from OpenACC regions.
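
One way to express the single-device-link pattern in CMake is sketched below. It is an illustration under stated assumptions, not necessarily how this changeset implements it: it assumes the C++ compiler is nvc++, reuses the target names, file names and flags from the command lines above, and relies on the standard CUDA_SEPARABLE_COMPILATION and CUDA_RESOLVE_DEVICE_SYMBOLS target properties to stop CMake from adding its own `nvcc -dlink` step.

```cmake
cmake_minimum_required(VERSION 3.15)
# Assumes CMAKE_CXX_COMPILER=nvc++ and CMAKE_CUDA_COMPILER=nvcc.
project(single_dlink_sketch LANGUAGES CXX CUDA)

# CUDA sources are compiled to relocatable device code (nvcc -dc) ...
add_library(cudastuff STATIC foo.cu)
set_target_properties(cudastuff PROPERTIES
  CUDA_SEPARABLE_COMPILATION ON
  # ... but CMake is asked not to device-link this library itself.
  CUDA_RESOLVE_DEVICE_SYMBOLS OFF)

add_executable(main main.cpp)
# Executables default to CUDA_RESOLVE_DEVICE_SYMBOLS=ON, so switch it off
# here as well; the intent is that nvc++ performs the only device link.
set_target_properties(main PROPERTIES CUDA_RESOLVE_DEVICE_SYMBOLS OFF)

# OpenACC flags are needed both when compiling and when linking.
target_compile_options(main PRIVATE -acc -gpu=cuda11.0,cc70)
# -cuda is passed only at link time, so .cpp files are not compiled
# with CUDA features enabled.
target_link_options(main PRIVATE -acc -gpu=cuda11.0,cc70 -cuda)

target_link_libraries(main PRIVATE cudastuff)
```

Keeping `-cuda` out of the compile options matches the "Only pass -cuda when linking" commit in this PR; everything else in the sketch is an assumption about how the same effect could be achieved.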

We now also prefer to dynamically link the CUDA runtime, libcudart.so. nvc++ -cuda seems to prefer this and only allows it to be steered by the -static-nvidia option, which would also statically link the OpenACC runtime (which has always been dynamically linked). Setting CMAKE_CUDA_RUNTIME_LIBRARY=Shared stops CMake from emitting -lcudart_static, which causes segfaults at teardown in combination with -cuda's dynamic linking.
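
As a minimal sketch of the runtime selection described above (the `cudastuff` target and `foo.cu` are again illustrative names): CMAKE_CUDA_RUNTIME_LIBRARY exists since CMake 3.17 and initialises the CUDA_RUNTIME_LIBRARY property of targets created after it is set, so it has to appear before the targets are defined. Older CMake versions simply ignore the variable.

```cmake
cmake_minimum_required(VERSION 3.15)
project(cudart_sketch LANGUAGES CXX CUDA)

# Ask for libcudart.so rather than -lcudart_static for all targets
# created below. Accepted values are None, Shared and Static.
set(CMAKE_CUDA_RUNTIME_LIBRARY Shared)

add_library(cudastuff STATIC foo.cu)

# Equivalent per-target form, if only some targets should be affected.
set_property(TARGET cudastuff PROPERTY CUDA_RUNTIME_LIBRARY Shared)
```

The same value can be given at configure time with -DCMAKE_CUDA_RUNTIME_LIBRARY=Shared, which is the form quoted in the description.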

cc: @kotsaloscv

Closes #520. Closes #607.

How to test this?
Follow instructions in #607 to test the link/global state issue.

Test System

OS: BB5
Compiler: NVHPC 21.7 / CUDA 11.0
Version: master
Backend: GPU

Use certain branches for the SimulationStack CI

CI_BRANCHES:NEURON_BRANCH=master,

Make sure device code linking only happens once, rather than linking explicit CUDA code earlier and linking OpenACC device code later. CUDA has first class language support in all recent CMake versions, so remove find_package(CUDA) and retire the deprecated `cuda_add_library` function. Retire the CORENRN_GPU_CUDA_COMPUTE_CAPABILITY CMake variable and use the standard CMAKE_CUDA_ARCHITECTURES one instead.

This avoids __CUDACC__ and CUDA features being enabled when compiling .cpp files, which breaks assumptions (but might be fine in the long run). Also prefix some preprocessor macro names with CORENEURON_.

bbpbuildbot · 2021-08-12T11:52:45Z

Logfiles from GitLab pipeline #13365 (:no_entry:) have been uploaded here!

Status and direct links:

bbpbuildbot · 2021-08-12T12:52:08Z

Logfiles from GitLab pipeline #13373 (:no_entry:) have been uploaded here!

Status and direct links:

bbpbuildbot · 2021-08-12T14:13:45Z

Logfiles from GitLab pipeline #13388 (:no_entry:) have been uploaded here!

Status and direct links:

bbpbuildbot · 2021-08-12T14:15:04Z

Logfiles from GitLab pipeline #13387 (:no_entry:) have been uploaded here!

Status and direct links:

Drop other -D argument that is now injected via the `localrc` file of BB5's PGI/NVHPC compiler installations.

bbpbuildbot · 2021-08-12T16:05:14Z

Logfiles from GitLab pipeline #13408 (:white_check_mark:) have been uploaded here!

Status and direct links:

bbpbuildbot · 2021-08-12T17:07:28Z

Logfiles from GitLab pipeline #13416 (:white_check_mark:) have been uploaded here!

Status and direct links:

olupton · 2021-08-13T08:35:55Z

We should merge BlueBrain/spack#1249 immediately before merging this.

alexsavulescu · 2021-08-13T09:33:57Z

Please retest

* clang-format-12 support in hpc-coding-conventions.
* Fix CUDA/GPU linking and avoid deprecated CMake. Make sure device code linking only happens once, rather than linking explicit CUDA code earlier and linking OpenACC device code later. CUDA has first class language support in all recent CMake versions, so remove find_package(CUDA) and retire the deprecated `cuda_add_library` function. Retire the CORENRN_GPU_CUDA_COMPUTE_CAPABILITY CMake variable and use the standard CMAKE_CUDA_ARCHITECTURES one instead.
* Only pass -cuda when linking. This avoids __CUDACC__ and CUDA features being enabled when compiling .cpp files, which breaks assumptions (but might be fine in the long run). Also prefix some preprocessor macro names with CORENEURON_.
* Consistent libcudart linkage.
* Fix silly error for -DNDEBUG.
* Update README to suggest CMAKE_CUDA_COMPILER=nvcc.
* CMake minimum v3.15 for GPU builds.
* Tweaks for older CMake versions.
* Set CMAKE_CUDA_COMPILER=nvcc. Drop other -D argument that is now injected via the `localrc` file of BB5's PGI/NVHPC compiler installations.

CoreNEURON Repo SHA: BlueBrain/CoreNeuron@170a0bb

olupton added 6 commits August 11, 2021 15:35

clang-format-12 support in hpc-coding-conventions.

1d7a1e1

Only pass -cuda when linking.

2ef82f8

Consistent libcudart linkage.

046fa39

Fix silly error for -DNDEBUG.

dbc7135

Update README to suggest CMAKE_CUDA_COMPILER=nvcc.

b520ace

CMake minimum v3.15 for GPU builds.

8e6ff6b

olupton force-pushed the olupton/modernise-cuda branch from 505b7cd to 8e6ff6b August 12, 2021 12:22

Tweaks for older CMake versions.

6a51b37

olupton closed this Aug 12, 2021

olupton reopened this Aug 12, 2021

Set CMAKE_CUDA_COMPILER=nvcc.

6ffff59

olupton mentioned this pull request Aug 13, 2021

Follow CMake/CUDA changes in CoreNEURON BlueBrain/spack#1249

Merged

olupton marked this pull request as ready for review August 13, 2021 07:46

olupton mentioned this pull request Aug 13, 2021

CoreNEURON uses deprecated FindCUDA #520

Closed

alexsavulescu approved these changes Aug 13, 2021

alexsavulescu merged commit 170a0bb into master Aug 13, 2021

alexsavulescu deleted the olupton/modernise-cuda branch August 13, 2021 11:14

olupton mentioned this pull request Aug 13, 2021

Do not use __managed__ in GPU builds. Add utils. #606

Merged

This was referenced Aug 18, 2021

Bump minimum supported CMake version to 3.15 #612

Closed

Bump minimum supported CMake version to 3.15 #613

Merged

olupton mentioned this pull request Sep 3, 2021

NMODL generates OpenACC code that crashes on machines without GPUs. BlueBrain/nmodl#727

Open
