score=-0.424665
components: 0 1 2 3 4 5
nodeIdx=[0,0,0] size=[310,465,930] rank=0 gpuId=0 cuda=0
nodeIdx=[1,0,0] size=[310,465,930] rank=1 gpuId=0 cuda=1
nodeIdx=[2,0,0] size=[310,465,930] rank=2 gpuId=0 cuda=2
nodeIdx=[0,1,0] size=[310,465,930] rank=3 gpuId=0 cuda=3
nodeIdx=[1,1,0] size=[310,465,930] rank=4 gpuId=0 cuda=4
nodeIdx=[2,1,0] size=[310,465,930] rank=5 gpuId=0 cuda=5
idx=[0,0,0] size=[310,465,930] rank=0 subdomain=0 cuda=0
rank=3 gpu=0 (cuda id=3) => [0,1,0]
rank=1 gpu=0 (cuda id=1) => [1,0,0]
idx=[1,0,0] size=[310,465,930] rank=1 subdomain=0 cuda=1
idx=[2,0,0] size=[310,465,930] rank=2 subdomain=0 cuda=2
rank=5 gpu=0 (cuda id=5) => [2,1,0]
rank=2 gpu=0 (cuda id=2) => [2,0,0]
rank=4 gpu=0 (cuda id=4) => [1,1,0]
idx=[0,1,0] size=[310,465,930] rank=3 subdomain=0 cuda=3
idx=[1,1,0] size=[310,465,930] rank=4 subdomain=0 cuda=4
/ccs/home/merth/pearson/stencil/include/stencil/local_domain.cuh@54: CUDA Runtime Error(46): all CUDA-capable devices are busy or unavailable
idx=[2,1,0] size=[310,465,930] rank=5 subdomain=0 cuda=5
rank=0 gpu=0 (cuda id=0) => [0,0,0]
/ccs/home/merth/pearson/stencil/include/stencil/local_domain.cuh@54: CUDA Runtime Error(46): all CUDA-capable devices are busy or unavailable
/ccs/home/merth/pearson/stencil/include/stencil/local_domain.cuh@54: CUDA Runtime Error(46): all CUDA-capable devices are busy or unavailable
/ccs/home/merth/pearson/stencil/include/stencil/local_domain.cuh@54: CUDA Runtime Error(46): all CUDA-capable devices are busy or unavailable
/ccs/home/merth/pearson/stencil/include/stencil/local_domain.cuh@54: CUDA Runtime Error(46): all CUDA-capable devices are busy or unavailable
comm plan
create remote
create colocated
create peer copy
DistributedDomain::realize: prepare peerAccessSender
/ccs/home/merth/pearson/stencil/include/stencil/rcstream.hpp@35: CUDA Runtime Error(46): all CUDA-capable devices are busy or unavailable

Running on Summit with
jsrun -n 1 -r 1 -c 42 -g 6 -a 6 -b rs js_task_info ../../build/src/weak
causes the errors above.

This is possibly because all GPUs in this configuration are reported to be in cudaComputeModeExclusiveProcess, which may only allow certain processes to access certain GPUs, even though all processes have visibility to all GPUs. It may mean that the first MPI rank that calls cudaSetDevice on a given GPU gets exclusive access to it.

Running with only a single process on the node works:
jsrun -n 1 -r 1 -c 42 -g 6 -a 1 -b rs js_task_info ../../build/src/weak
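
The exclusive-access hypothesis above can be written down as a small model, which may help when reasoning about which accesses would fail. This is only a sketch of the hypothesis, not of what the CUDA runtime actually does: the names ComputeMode, Touch and rejectedTouches are made up for illustration, and the "first rank to touch a device owns it" rule is the assumption being tested, not a confirmed behavior. The enum values are copied from the CUDA runtime's cudaComputeMode so the code builds without the CUDA toolkit.

```cpp
#include <cassert>
#include <map>
#include <utility>
#include <vector>

// Numeric values mirror the CUDA runtime's cudaComputeMode enum; they are
// repeated here so the sketch compiles without the CUDA toolkit.
enum ComputeMode {
  kDefault = 0,          // cudaComputeModeDefault
  kProhibited = 2,       // cudaComputeModeProhibited
  kExclusiveProcess = 3  // cudaComputeModeExclusiveProcess
};

// One attempt by an MPI rank (first) to create a context on a device (second).
using Touch = std::pair<int, int>;

// Hypothetical model, not CUDA itself: on a device in exclusive-process mode,
// the first rank to touch it becomes its owner and every later touch by a
// different rank is rejected. Touches are processed in the order given.
std::vector<Touch> rejectedTouches(const std::vector<ComputeMode>& modes,
                                   const std::vector<Touch>& touches) {
  std::map<int, int> owner;  // device index -> rank that holds the device
  std::vector<Touch> rejected;
  for (const Touch& t : touches) {
    const int rank = t.first;
    const int dev = t.second;
    assert(dev >= 0 && dev < static_cast<int>(modes.size()));
    const ComputeMode mode = modes[dev];
    if (mode == kProhibited) {  // no process may use this device
      rejected.push_back(t);
      continue;
    }
    if (mode != kExclusiveProcess) continue;  // default mode: device is shared
    auto it = owner.find(dev);
    if (it == owner.end()) {
      owner[dev] = rank;  // first rank to arrive takes the device
    } else if (it->second != rank) {
      rejected.push_back(t);  // another rank already holds the device
    }
  }
  return rejected;
}
```

Under this model, if ranks 0 through 5 each take their own device and rank 0 then touches device 1 (for example while setting up a peer copy), only that last touch is rejected; with every device in default mode nothing is rejected. That would be consistent with a single process per node working, though the logs above do not confirm where the extra device accesses come from. On a real node, the mode of each device could be read with cudaDeviceGetAttribute and cudaDevAttrComputeMode before deciding how many ranks to launch.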