This repository was archived by the owner on Mar 20, 2023. It is now read-only.

Conversation

@olupton
Contributor

@olupton olupton commented Jul 29, 2021

Description
This fixes CPU execution of GPU builds on machines that do not have GPUs; such runs previously segfaulted due to the use of the __managed__ keyword. The handling of Random123 state is now explicit about host vs. device. The offending code was only added in #595, and the issue went unnoticed at the time because the GPU-enabled CI only runs tests on machines that have GPUs.

Also add a pair of helper functions coreneuron::[de]allocate_unified() that wrap cudaMallocManaged() in GPU builds if --gpu was passed at runtime and fall back to new/delete otherwise, and a method coreneuron::unified_memory_enabled() that queries whether this condition is met.
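As a rough sketch of the behaviour described above (not the actual CoreNEURON implementation: the `coreneuron_sketch` namespace and the `CORENRN_ENABLE_GPU` macro here are assumptions, and the CUDA calls are only indicated in comments so the sketch runs on any host):

```cpp
#include <cstddef>
#include <new>

// Sketch of the described helpers. In a real GPU build the GPU branch
// would call cudaMallocManaged()/cudaFree(); plain new/delete stands in
// here so this compiles and runs without CUDA.
namespace coreneuron_sketch {

bool unified_memory_enabled() {
#ifdef CORENRN_ENABLE_GPU  // assumed build flag; real code also checks --gpu at runtime
    return true;
#else
    return false;
#endif
}

void* allocate_unified(std::size_t num_bytes) {
    if (unified_memory_enabled()) {
        // GPU path: cudaMallocManaged(&ptr, num_bytes) and check the status.
    }
    return ::operator new(num_bytes);  // CPU fallback
}

void deallocate_unified(void* ptr, std::size_t /*num_bytes*/) {
    // GPU path: cudaFree(ptr); CPU fallback:
    ::operator delete(ptr);
}

}  // namespace coreneuron_sketch
```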

Additionally, add a C++ allocator template coreneuron::unified_allocator<T> that wraps these functions, and a templated coreneuron::alloc_deleter<T> for use with std::unique_ptr<T, D>. Also add a coreneuron::allocate_unique helper adapted from Stack Overflow.
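The allocator/deleter/allocate_unique trio can be sketched like this (illustrative only, not the CoreNEURON code: the new/delete bodies stand in for the unified-memory helpers, and allocate_unique follows the well-known Stack Overflow make_unique-with-allocator pattern):

```cpp
#include <cstddef>
#include <memory>
#include <new>
#include <utility>

// Minimal C++ allocator; the real version would route allocate/deallocate
// through the unified-memory helpers instead of plain new/delete.
template <typename T>
struct unified_allocator {
    using value_type = T;
    unified_allocator() = default;
    template <typename U>
    unified_allocator(unified_allocator<U> const&) noexcept {}
    T* allocate(std::size_t n) {
        // real code: static_cast<T*>(allocate_unified(n * sizeof(T)))
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) noexcept {
        // real code: deallocate_unified(p, ...)
        ::operator delete(p);
    }
};

// Deleter for std::unique_ptr: destroys the object, then returns the
// storage through the allocator.
template <typename Alloc>
struct alloc_deleter {
    Alloc alloc;
    using pointer = typename std::allocator_traits<Alloc>::pointer;
    void operator()(pointer p) {
        std::allocator_traits<Alloc>::destroy(alloc, std::addressof(*p));
        std::allocator_traits<Alloc>::deallocate(alloc, p, 1);
    }
};

// make_unique analogue that takes an allocator (the Stack Overflow pattern).
template <typename Alloc, typename... Args>
auto allocate_unique(Alloc alloc, Args&&... args) {
    using Traits = std::allocator_traits<Alloc>;
    using T = typename Traits::value_type;
    T* p = Traits::allocate(alloc, 1);
    try {
        Traits::construct(alloc, p, std::forward<Args>(args)...);
    } catch (...) {
        Traits::deallocate(alloc, p, 1);
        throw;
    }
    return std::unique_ptr<T, alloc_deleter<Alloc>>{p, alloc_deleter<Alloc>{alloc}};
}
```

With this, `auto v = allocate_unique(unified_allocator<int>{}, 42);` yields a std::unique_ptr whose deleter releases the storage back through the allocator rather than calling plain delete.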

Clean up the Random123 code by dropping the unused nrnran123_mutconstruct() method.

Tweak the compilation scripts to allow for circular dependencies between libcoreneuron.a and libcudacoreneuron.a.
In the future, these should be merged into a single library.

This addresses #599 (comment). #599 should stay open because various OpenACC calls are still not conditional on --gpu.

How to test this?
Try running a GPU-built special-core without --gpu on a machine that does not have an NVIDIA GPU.

Test System

  • OS: BB5
  • Compiler: NVHPC 21.7 / CUDA 11.0 / GCC 9.3
  • Version: master
  • Backend: GPU/CPU

Use certain branches for the SimulationStack CI

CI_BRANCHES:NEURON_BRANCH=master,

@olupton olupton requested a review from iomaganaris July 29, 2021 14:20
@olupton
Contributor Author

olupton commented Jul 30, 2021

This is blocked by #607.

@olupton olupton marked this pull request as draft July 30, 2021 14:21
@olupton olupton force-pushed the olupton/gpu-without-gpu branch from ee3feb6 to f439456 Compare August 13, 2021 12:13
This fixes CPU execution of GPU builds on machines that do not have
GPUs, which were previously segfaulting due to the use of the
__managed__ keyword. Now the handling of Random123 state is more
explicit for host/device.

Also add a pair of helper functions coreneuron::[de]allocate_unified()
that wrap cudaMallocManaged in GPU builds if --gpu was passed at runtime
and fall back to new/delete otherwise, and a method
coreneuron::unified_memory_enabled() that queries whether this condition
is met.

Additionally add a C++ allocator template coreneuron::unified_allocator<T>
that wraps these functions, a templated coreneuron::alloc_deleter<T> for use
with std::unique_ptr<T, D>, and a helper coreneuron::allocate_unique(...).

Cleanup Random123 code by dropping an unused nrnran123_mutconstruct
method.

Tweak compilation/CMake scripts to remove libcudacoreneuron.a and
instead build CUDA sources inside libcoreneuron.a. This sidesteps
circular dependency issues that would otherwise be introduced by this
commit.

Modify CMake so `clang-format` target formats CUDA (.cu) files too.
@olupton olupton force-pushed the olupton/gpu-without-gpu branch from f439456 to 43b595a Compare August 13, 2021 12:51
@olupton olupton marked this pull request as ready for review August 13, 2021 12:53
Contributor

@ferdonline ferdonline left a comment

LGTM, but I'm not the best person to review this code. Maybe let's wait for a review from @iomaganaris.

Contributor

@kotsaloscv kotsaloscv left a comment


LGTM

nrnran123_setseq(s, 0, 0);
{
// TODO: can I assert something useful about the instance count going
// back to zero anywhere? Or that it is zero when some operations happen?
Collaborator


Answered in previous comment.

@nrnhines: how should the Random123 streams created with nrnran123_newstream3 inside bbcore_read() be allocated? Any thoughts?

Collaborator


Let's keep this as a separate issue. BlueBrain/nmodl#383 would help to implement this easily.

@olupton olupton force-pushed the olupton/gpu-without-gpu branch from a421fa5 to 899fffa Compare August 16, 2021 12:53
Collaborator

@pramodk pramodk left a comment


LGTM

@olupton olupton merged commit ac2fa3b into master Aug 16, 2021
@olupton olupton deleted the olupton/gpu-without-gpu branch August 16, 2021 14:42
pramodk pushed a commit to neuronsimulator/nrn that referenced this pull request Nov 2, 2022
…n#606)

(Commit message as above.)

CoreNEURON Repo SHA: BlueBrain/CoreNeuron@ac2fa3b