This repository was archived by the owner on Mar 20, 2023. It is now read-only.

@alkino alkino commented Mar 24, 2020

no chkpnt line for 20
/Users/kumbhar/workarena/repos/bbp/coreneuron/coreneuron/io/nrn_filehandler.cpp:115: Assertion 'n_scan == 1' failed.
readme: line 19: 84214 Abort trap: 6           x86_64/special-core -d test${i}dat --cell-permute 1 -e 100
+ cat out.dat

Before this PR in master we get:

Error: hh is a different MOD file than used by NEURON!
Error : NEURON and CoreNEURON must use same mod files for compatibility, 1 different mod file(s) found. Re-compile special and special-core!
readme: line 19: 76753 Abort trap: 6           x86_64/special-core -d test${i}dat --cell-permute 1 -e 100

When running in online mode we get:

 Memory (MBs) :          Before nrn_setup : Max 19.9883, Min 19.9883, Avg 19.9883
/Users/kumbhar/workarena/repos/bbp/nn/build_normal/install/bin/nrniv: Segmentation violation
 in testgf.hoc near line 51
 prun("")
         ^
        ParallelContext[0].nrncore_run("--tstop 10...")
      prun("")

The LLDB stack trace shows:

* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x30)
    frame #0: 0x0000000100168645 libnrniv.dylib`nrnthread_dat2_corepointer_mech(tid=<unavailable>, type=0, icnt=0x00007ffeefbfcf80, dcnt=0x00007ffeefbfcfd8, iArray=0x00007ffeefbfcff0, dArray=<unavailable>) at nrnbbcore_write.cpp:1408:25 [opt]
   1405	      dcnt = 0;
   1406	      icnt = 0;
   1407	      // data size and allocate
-> 1408	      for (int i = 0; i < ml->nodecount; ++i) {
   1409	        (*nrn_bbcore_write_[type])(NULL, NULL, &dcnt, &icnt, ml->data[i], ml->pdata[i], ml->_thread, &nt);
   1410	      }
   1411	      dArray = NULL;
Target 0: (nrniv) stopped.
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x30)
  * frame #0: 0x0000000100168645 libnrniv.dylib`nrnthread_dat2_corepointer_mech(tid=<unavailable>, type=0, icnt=0x00007ffeefbfcf80, dcnt=0x00007ffeefbfcfd8, iArray=0x00007ffeefbfcff0, dArray=<unavailable>) at nrnbbcore_write.cpp:1408:25 [opt]
    frame #1: 0x00000001039fc9ac libcoreneuron.dylib`coreneuron::Phase2::read_direct(this=0x00007ffeefbfd070, thread_id=0, nt=0x0000000100c34000) at phase2.cpp:326:9 [opt]
    frame #2: 0x00000001039f2558 libcoreneuron.dylib`void* coreneuron::coreneuron::phase_wrapper_w<(coreneuron::coreneuron::phase)2>(coreneuron::NrnThread*, coreneuron::UserParams&, bool) [inlined] coreneuron::read_phase2(nt=0x0000000100c34000, userParams=0x00007ffeefbfd320) at nrn_setup.cpp:898:12 [opt]
    frame #3: 0x00000001039f24ae libcoreneuron.dylib`void* coreneuron::coreneuron::phase_wrapper_w<(coreneuron::coreneuron::phase)2>(coreneuron::NrnThread*, coreneuron::UserParams&, bool) [inlined] void coreneuron::coreneuron::read_phase_aux<(coreneuron::coreneuron::phase)2>(nt=0x0000000100c34000, userParams=0x00007ffeefbfd320) at nrn_setup.hpp:96 [opt]
    frame #4: 0x00000001039f24ae libcoreneuron.dylib`void* coreneuron::coreneuron::phase_wrapper_w<(coreneuron::coreneuron::phase)2>(nt=0x0000000100c34000, userParams=0x00007ffeefbfd320, in_memory_transfer=<unavailable>) at nrn_setup.hpp:134 [opt]
    frame #5: 0x00000001039efe62 libcoreneuron.dylib`coreneuron::nrn_setup(char const*, bool, bool, bool, char const*, char const*, double*) [inlined] void coreneuron::nrn_multithread_job<void* (&)(coreneuron::NrnThread*, coreneuron::UserParams&, bool), coreneuron::UserParams&, bool>(args=0x0000000000000001)(coreneuron::NrnThread*, coreneuron::UserParams&, bool), coreneuron::UserParams&, bool&&) at multicore.hpp:165:9 [opt]
    frame #6: 0x00000001039efe1e libcoreneuron.dylib`coreneuron::nrn_setup(char const*, bool, bool, bool, char const*, char const*, double*) [inlined] void coreneuron::coreneuron::phase_wrapper<(coreneuron::coreneuron::phase)2>(userParams=0x0000000000000001) at nrn_setup.hpp:148 [opt]
    frame #7: 0x00000001039efe1e libcoreneuron.dylib`coreneuron::nrn_setup(filesdat=<unavailable>, is_mapping_needed=false, (null)=<unavailable>, run_setup_cleanup=true, datpath=<unavailable>, restore_path=<unavailable>, mindelay=0x0000000103a3c980) at nrn_setup.cpp:537 [opt]
    frame #8: 0x00000001039e3fb0 libcoreneuron.dylib`coreneuron::nrn_init_and_load_data(argc=4, argv=0x0000000100c30330, is_mapping_needed=false, nrnmpi_under_nrncontrol=<unavailable>, run_setup_cleanup=true) at main1.cpp:252:5 [opt]
    frame #9: 0x00000001039e4bdf libcoreneuron.dylib`::run_solve_core(argc=4, argv=0x0000000100c30330) at main1.cpp:460:9 [opt]

Using the phase1 branch from Nicolas Cornu gives:

$ x86_64/special -mpi $HOC_LIBRARY_PATH/init.hoc
numprocs=1
NEURON -- VERSION 8.0.dev-101-ge529b4f+ HEAD (e529b4f+) 2020-05-26
Duke, Yale, and the BlueBrain Project -- Copyright 1984-2019
See http://neuron.yale.edu/neuron/credits

Additional mechanisms from files
 neocortex/mod/v5/ALU.mod neocortex/mod/v5/ASCIIrecord.mod neocortex/mod/v5/BinReportHelper.mod neocortex/mod/v5/BinReports.mod neocortex/mod/v5/CaDynamics_E2.mod neocortex/mod/v5/Ca_HVA.mod neocortex/mod/v5/Ca_LVAst.mod neocortex/mod/v5/Ca.mod neocortex/mod/v5/CoreConfig.mod neocortex/mod/v5/DetAMPANMDA.mod neocortex/mod/v5/DetGABAAB.mod neocortex/mod/v5/gap.mod neocortex/mod/v5/GluSynapse.mod neocortex/mod/v5/HDF5reader.mod neocortex/mod/v5/HDF5record.mod neocortex/mod/v5/Ih.mod neocortex/mod/v5/Im.mod neocortex/mod/v5/KdShu2007.mod neocortex/mod/v5/K_Pst.mod neocortex/mod/v5/K_Tst.mod neocortex/mod/v5/lookupTableV2.mod neocortex/mod/v5/memoryaccess.mod neocortex/mod/v5/MemUsage.mod neocortex/mod/v5/Nap_Et2.mod neocortex/mod/v5/NaTa_t.mod neocortex/mod/v5/NaTs2_t.mod neocortex/mod/v5/netstim_inhpoisson.mod neocortex/mod/v5/ProbAMPANMDA_EMS.mod neocortex/mod/v5/ProbGABAAB_EMS.mod neocortex/mod/v5/ProfileHelper.mod neocortex/mod/v5/SK_E2.mod neocortex/mod/v5/SKv3_1.mod neocortex/mod/v5/SpikeWriter.mod neocortex/mod/v5/StochKv.mod neocortex/mod/v5/SynapseReader.mod neocortex/mod/v5/TTXDynamicsSwitch.mod neocortex/mod/v5/utility.mod neocortex/mod/v5/VecStim.mod
[DEBUG] Memusage [MB]: Max=14.46, Min=14.46, Mean(Stdev)=14.46(0.00)
Verbose rank toggled
....
  * Target Layer23          52629 cells [2 subtargets]
  * Target Layer45          75412 cells [2 subtargets]
...
MemUsage after stdinit
[DEBUG] Memusage [MB]: Max=134.09, Min=134.09, Mean(Stdev)=134.09(0.00)
 num_mpi=1
 num_omp_thread=1


 Duke, Yale, and the BlueBrain Project -- Copyright 1984-2019
 version id unimplemented

 Additional mechanisms from files
 BinReportHelper.mod BinReports.mod Ca.mod CaDynamics_E2.mod Ca_HVA.mod Ca_LVAst.mod CoreConfig.mod DetAMPANMDA.mod DetGABAAB.mod Ih.mod Im.mod K_Pst.mod K_Tst.mod KdShu2007.mod MemUsage.mod NaTa_t.mod NaTs2_t.mod Nap_Et2.mod ProbAMPANMDA_EMS.mod ProbGABAAB_EMS.mod ProfileHelper.mod SK_E2.mod SKv3_1.mod StochKv.mod SynapseReader.mod TTXDynamicsSwitch.mod VecStim.mod gap.mod netstim_inhpoisson.mod

 Memory (MBs) :             After mk_mech : Max 190.3281, Min 190.3281, Avg 190.3281
 num_mpi=1
 num_omp_thread=1

 Memory (MBs) :            After MPI_Init : Max 190.3281, Min 190.3281, Avg 190.3281
 Memory (MBs) :          Before nrn_setup : Max 190.3281, Min 190.3281, Avg 190.3281
x86_64/special: Segmentation violation
 in /gpfs/bbp.cscs.ch/project/proj16/kumbhar/pramod_scratch/CNEUR-348/sim/../neocortex/hoc//init.hoc near line 262
 {cvode.cache_efficient(1)}
                           ^
        ParallelContext[1].nrncore_run("--tstop 10...")
      Node[0].finalizeModel()
    build_model(Node[0])
  main()

Looking at the backtrace, we get:

(gdb) bt
#0  0x00007fffea359d80 in __memmove_ssse3_back () from /lib64/libc.so.6
#1  0x00007fffc8e53d11 in std::__copy_move<false, true, std::random_access_iterator_tag>::__copy_m<int> (__first=0x0, __last=0x4, __result=0x8ecc7e0)
    at /gpfs/bbp.cscs.ch/ssd/apps/hpc/jenkins/deploy/compilers/2020-02-01/linux-rhel7-x86_64/gcc-4.8.5/gcc-8.3.0-4hd4baobq2/include/c++/8.3.0/bits/stl_algobase.h:368
#2  0x00007fffc8e536be in std::__copy_move_a<false, int*, int*> (__first=0x0, __last=0x4, __result=0x8ecc7e0)
    at /gpfs/bbp.cscs.ch/ssd/apps/hpc/jenkins/deploy/compilers/2020-02-01/linux-rhel7-x86_64/gcc-4.8.5/gcc-8.3.0-4hd4baobq2/include/c++/8.3.0/bits/stl_algobase.h:386
#3  0x00007fffc8e54393 in std::__copy_move_a2<false, int*, int*> (__first=0x0, __last=0x4, __result=0x8ecc7e0)
    at /gpfs/bbp.cscs.ch/ssd/apps/hpc/jenkins/deploy/compilers/2020-02-01/linux-rhel7-x86_64/gcc-4.8.5/gcc-8.3.0-4hd4baobq2/include/c++/8.3.0/bits/stl_algobase.h:422
#4  0x00007fffc8e542ff in std::copy<int*, int*> (__first=0x0, __last=0x4, __result=0x8ecc7e0)
    at /gpfs/bbp.cscs.ch/ssd/apps/hpc/jenkins/deploy/compilers/2020-02-01/linux-rhel7-x86_64/gcc-4.8.5/gcc-8.3.0-4hd4baobq2/include/c++/8.3.0/bits/stl_algobase.h:455
#5  0x00007fffc8e55e41 in coreneuron::Phase2::read_direct (this=0x7fffffffa1b0, thread_id=0, nt=...) at /gpfs/bbp.cscs.ch/project/proj16/kumbhar/pramod_scratch/CNEUR-348/coreneuron/coreneuron/io/phase2.cpp:289
#6  0x00007fffc8e3bd40 in coreneuron::read_phase2 (nt=..., userParams=...) at /gpfs/bbp.cscs.ch/project/proj16/kumbhar/pramod_scratch/CNEUR-348/coreneuron/coreneuron/io/nrn_setup.cpp:898
#7  0x00007fffc8e3cc07 in coreneuron::coreneuron::read_phase_aux<(coreneuron::coreneuron::phase)2> (nt=..., userParams=...) at /gpfs/bbp.cscs.ch/project/proj16/kumbhar/pramod_scratch/CNEUR-348/coreneuron/build/include/coreneuron/io/nrn_setup.hpp:96
#8  0x00007fffc8e41186 in coreneuron::coreneuron::phase_wrapper_w<(coreneuron::coreneuron::phase)2> (nt=0x8f3a30, userParams=..., in_memory_transfer=true)
    at /gpfs/bbp.cscs.ch/project/proj16/kumbhar/pramod_scratch/CNEUR-348/coreneuron/build/include/coreneuron/io/nrn_setup.hpp:134
#9  0x00007fffc8e47a56 in coreneuron::_ZN10coreneuron19nrn_multithread_jobIRFPvPNS_9NrnThreadERNS_10UserParamsEbEJS5_bEEEvOT_DpOT0_._omp_fn.0(void) ()
    at /gpfs/bbp.cscs.ch/project/proj16/kumbhar/pramod_scratch/CNEUR-348/coreneuron/build/include/coreneuron/sim/multicore.hpp:165
#10 0x00007fffd1f3ead8 in __kmp_api_GOMP_parallel (task=0x8ecc7e0, data=0x0, num_threads=3929382272, flags=0) at ../../src/kmp_gsupport.cpp:1449
#11 0x00007fffc8e40bb8 in coreneuron::nrn_multithread_job<void* (&)(coreneuron::NrnThread*, coreneuron::UserParams&, bool), coreneuron::UserParams&, bool> (job=
    @0x7fffc8e40f31: {void *(coreneuron::NrnThread *, coreneuron::UserParams &, bool)} 0x7fffc8e40f31 <coreneuron::coreneuron::phase_wrapper_w<(coreneuron::coreneuron::phase)2>(coreneuron::NrnThread*, coreneuron::UserParams&, bool)>, args#0=...,
    args#1=@0x7fffffffa5cf: true) at /gpfs/bbp.cscs.ch/project/proj16/kumbhar/pramod_scratch/CNEUR-348/coreneuron/build/include/coreneuron/sim/multicore.hpp:162
#12 0x00007fffc8e3c41d in coreneuron::coreneuron::phase_wrapper<(coreneuron::coreneuron::phase)2> (userParams=..., direct=1) at /gpfs/bbp.cscs.ch/project/proj16/kumbhar/pramod_scratch/CNEUR-348/coreneuron/build/include/coreneuron/io/nrn_setup.hpp:148
#13 0x00007fffc8e3abae in coreneuron::nrn_setup (filesdat=0x7fffffffa710 "./files.dat", is_mapping_needed=false, run_setup_cleanup=true, datpath=0x7fffc9130d68 <coreneuron::corenrn_param+168> ".", restore_path=0x7fffffffa730 "",
    mindelay=0x7fffc9130d30 <coreneuron::corenrn_param+112>) at /gpfs/bbp.cscs.ch/project/proj16/kumbhar/pramod_scratch/CNEUR-348/coreneuron/coreneuron/io/nrn_setup.cpp:537
#14 0x00007fffc8e27bf1 in coreneuron::nrn_init_and_load_data (argc=5, argv=0x8b9e610, is_mapping_needed=false, nrnmpi_under_nrncontrol=true, run_setup_cleanup=true)
    at /gpfs/bbp.cscs.ch/project/proj16/kumbhar/pramod_scratch/CNEUR-348/coreneuron/coreneuron/apps/main1.cpp:256
#15 0x00007fffc8e28780 in run_solve_core (argc=5, argv=0x8b9e610) at /gpfs/bbp.cscs.ch/project/proj16/kumbhar/pramod_scratch/CNEUR-348/coreneuron/coreneuron/apps/main1.cpp:481

Fixed in b2c4400: artificial cells don't have nodeindices.
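
The backtrace above shows `std::copy` being called with `__first=0x0`, which is consistent with copying from a node-index array that does not exist. As a rough illustration of the kind of guard such a fix implies, here is a minimal sketch. The helper `copy_nodeindices` and its `is_artificial` flag are hypothetical names chosen for this example, not the actual change in b2c4400.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not the CoreNEURON code: copies `count` node indices
// from a raw array into a vector. Artificial cells have no nodes, so for
// them the source pointer may be null and nothing is copied.
std::vector<int> copy_nodeindices(const int* src, std::size_t count, bool is_artificial) {
    if (is_artificial) {
        return {};  // no nodeindices to copy for artificial cells
    }
    // For every other mechanism a null source would be a real error.
    assert(src != nullptr);
    return std::vector<int>(src, src + count);
}
```

Calling it with `is_artificial = true` returns an empty vector without ever touching the source pointer.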

  • Run olfactory 3d-bulb model with this PR

This model works in file transfer mode, but with in-memory transfer we get:

total setup time  31.25
mindelay = 1
cvode active= False
t=0 wall interval 0.43
 num_mpi=2


 Duke, Yale, and the BlueBrain Project -- Copyright 1984-2019
 version id unimplemented

 Additional mechanisms from files
 Gfluct.mod GluSynapse.mod ThreshDetect.mod ampanmda.mod distrt.mod fi.mod fi_stdp.mod gap.mod kamt.mod kdrmt.mod ks.mod naxn.mod orn.mod ostimhelper.mod

 Memory (MBs) :             After mk_mech : Max 332.3320, Min 330.1172, Avg 331.2246
 Memory (MBs) :            After MPI_Init : Max 332.3320, Min 330.1172, Avg 331.2246
 Memory (MBs) :          Before nrn_setup : Max 332.3633, Min 330.1445, Avg 331.2539
[bluebrain355:28861] *** Process received signal ***
[bluebrain355:28861] Signal: Segmentation fault: 11 (11)
[bluebrain355:28861] Signal code: Address not mapped (1)
[bluebrain355:28861] Failing at address: 0x0
[bluebrain355:28861] [ 0] 0   libsystem_platform.dylib            0x00007fff70a575fd _sigtramp + 29
[bluebrain355:28861] [ 1] 0   libsystem_c.dylib                   0x00007fff708e96e5 __sfvwrite + 402
[bluebrain355:28861] [ 2] 0   libcoreneuron.dylib                 0x000000010e827f66 _ZN10coreneuron6Phase211read_directEiRKNS_9NrnThreadE + 1734
[bluebrain355:28861] [ 3] 0   libcoreneuron.dylib                 0x000000010e81e0d8 _ZN10coreneuron10coreneuron15phase_wrapper_wILNS0_5phaseE2EEEPvPNS_9NrnThreadERNS_10UserParamsEb + 216

False alarm! This was also fixed by b2c4400.

  • Save Restore testing with 3-d bulb model

Currently we are getting:

 Memory (MBs) :            After MPI_Init : Max 3.2031, Min 3.2031, Avg 3.2031
 Memory (MBs) :          Before nrn_setup : Max 3.2461, Min 3.2461, Avg 3.2461
Model uses gap junctions
 ==>  8602 0 0
libcorenrnmech.dylib was compiled with optimization - stepping may behave oddly; variables may not be available.
Process 50945 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
    frame #0: 0x0000000100117d71 libcorenrnmech.dylib`bbcore_read(x=0x0000000000000000, d=0x0000000000000000, xx=0x00007ffeefbfd82c, offset=0x00007ffeefbfd7c8, _iml=0, _cntml_padded=40, _p=0x00000001045d1340, _ppvar=0x000000010200ad40, _thread=0x0000000000000000, _nt=0x0000000100705d50, v=0) at orn.cpp:801:29 [opt]
   798 		assert(!_p_donotuse);
   799 		uint32_t* di = ((uint32_t*)d) + *offset;
   800 		nrnran123_State** pv = (nrnran123_State**)(&_p_donotuse);
-> 801 		*pv = nrnran123_newstream3(di[0], di[1], di[2]);
   802 		*offset += 3;
   803 	}
   804 	 namespace coreneuron {
Target 0: (special-core) stopped.
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
  * frame #0: 0x0000000100117d71 libcorenrnmech.dylib`bbcore_read(x=0x0000000000000000, d=0x0000000000000000, xx=0x00007ffeefbfd82c, offset=0x00007ffeefbfd7c8, _iml=0, _cntml_padded=40, _p=0x00000001045d1340, _ppvar=0x000000010200ad40, _thread=0x0000000000000000, _nt=0x0000000100705d50, v=0) at orn.cpp:801:29 [opt]
    frame #1: 0x000000010017a0d5 libcoreneuron.dylib`coreneuron::read_phase2(F=<unavailable>, imult=<unavailable>, nt=<unavailable>) at nrn_setup.cpp:2016:13 [opt]
    frame #2: 0x00000001001773f5 libcoreneuron.dylib`void* coreneuron::coreneuron::phase_wrapper_w<(coreneuron::coreneuron::phase)2>(coreneuron::NrnThread*) [inlined] void coreneuron::coreneuron::read_phase_aux<(coreneuron::coreneuron::phase)2>(F=<unavailable>, imult=<unavailable>, nt=<unavailable>) at nrn_setup.hpp:101:5 [opt]
    frame #3: 0x00000001001773ed libcoreneuron.dylib`void* coreneuron::coreneuron::phase_wrapper_w<(coreneuron::coreneuron::phase)2>(nt=0x0000000100705d50) at nrn_setup.hpp:141 [opt]
    frame #4: 0x000000010017423b libcoreneuron.dylib`coreneuron::nrn_setup(char const*, bool, bool, bool, char const*, char const*, double*) [inlined] void coreneuron::nrn_multithread_job<void* (&)(coreneuron::NrnThread*)>(void* (&)(coreneuron::NrnThread*)) at multicore.hpp:165:9 [opt]
    frame #5: 0x000000010017421b libcoreneuron.dylib`coreneuron::nrn_setup(char const*, bool, bool, bool, char const*, char const*, double*) [inlined] void coreneuron::coreneuron::phase_wrapper<(coreneuron::coreneuron::phase)2>(int) at nrn_setup.hpp:158 [opt]
    frame #6: 0x000000010017421b libcoreneuron.dylib`coreneuron::nrn_setup(filesdat=<unavailable>, is_mapping_needed=false, byte_swap=false, run_setup_cleanup=true, datpath="dataset/", restore_path="ouput4/", mindelay=0x00000001001bd8b0) at nrn_setup.cpp:777 [opt]
    frame #7: 0x000000010016750e libcoreneuron.dylib`coreneuron::nrn_init_and_load_data(argc=<unavailable>, argv=<unavailable>, is_mapping_needed=false, run_setup_cleanup=true) at main1.cpp:250:5 [opt]
    frame #8: 0x0000000100168264 libcoreneuron.dylib`::run_solve_core(argc=<unavailable>, argv=<unavailable>) at main1.cpp:485:9 [opt]
    frame #9: 0x000000010010a790 libcorenrnmech.dylib`::solve_core(argc=11, argv=0x00007ffeefbfe418) at enginemech.cpp:42:15 [opt]
    frame #10: 0x00007fff7085ecc9 libdyld.dylib`start + 1
    frame #11: 0x00007fff7085ecc9 libdyld.dylib`start + 1
(lldb) f 1
libcoreneuron.dylib was compiled with optimization - stepping may behave oddly; variables may not be available.
frame #1: 0x000000010017a0d5 libcoreneuron.dylib`coreneuron::read_phase2(F=<unavailable>, imult=<unavailable>, nt=<unavailable>) at nrn_setup.cpp:2016:13 [opt]
   2013	            pd += nrn_i_layout(jp, cntml, 0, pdsz, layout);
   2014	            int aln_cntml = nrn_soa_padded_size(cntml, layout);
   2015	            std::cout << " ==> " << " " << nt.ncell << " "  << icnt << " " << dcnt << " \n";
-> 2016	            (*corenrn.get_bbcore_read()[type])(dArray, iArray, &dk, &ik, 0, aln_cntml, d, pd, ml->_thread,

The issue was that the mod files had the bbcore_write function body commented out, so the bbcore iArray / dArray contents were not correct: https://github.com/pramodk/modeldb-bulb3d-sim/pull/8/files

→ git show HEAD
commit 5b3265ec7d52306607544f2eb50ff76d25c03886 (HEAD -> tmp, origin/pramodk/bbcore_write)
Author: Pramod Kumbhar <[email protected]>
Date:   Fri Aug 7 19:07:52 2020 +0200

    bbcore_write are updated to support checkpoint-restart with CoreNEURON

diff --git a/sim/orn.mod b/sim/orn.mod
index 6b8057a..c662cb5 100755
--- a/sim/orn.mod
+++ b/sim/orn.mod
@@ -322,8 +322,8 @@ static void bbcore_write(double* x, int* d, int* xx, int *offset, _threadargspro
                }
                /*printf("orn bbcore_write %d %d %d\n", di[0], di[1], di[3]);*/
        }
-       *offset += 3;
 #endif
+       *offset += 3;
 }

 static void bbcore_read(double* x, int* d, int* xx, int* offset, _threadargsproto_) {
diff --git a/sim/ostimhelper.mod b/sim/ostimhelper.mod
index 8b929f8..8ce1ad2 100644
--- a/sim/ostimhelper.mod
+++ b/sim/ostimhelper.mod
@@ -90,8 +90,8 @@ static void bbcore_write(double* x, int* d, int* xx, int *offset, _threadargspro
     nrnran123_State** pv = (nrnran123_State**)(&_p_space);
     nrnran123_getids3(*pv, di, di+1, di+2);
   }
-  *offset += 3;
 #endif
+  *offset += 3;
 }
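
The diff above moves `*offset += 3;` after the `#endif`, so the offset advances whether or not the guarded body is compiled in. The earlier listing from nrnbbcore_write.cpp also shows a first call with NULL arrays that only counts sizes. A minimal sketch of that count-then-write pattern follows; `toy_bbcore_write` and `serialize_ids` are hypothetical stand-ins written for this example, not the mod-file or NEURON code.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a mechanism's bbcore_write: stores three ids.
// When `d` is null the call only counts how many ints it would write.
void toy_bbcore_write(int* d, int* offset, const std::uint32_t ids[3]) {
    if (d != nullptr) {
        for (int k = 0; k < 3; ++k) {
            d[*offset + k] = static_cast<int>(ids[k]);
        }
    }
    // Advance unconditionally so the sizing pass and the writing pass agree.
    *offset += 3;
}

// Serialize `n` instances: pass 1 sizes the buffer, pass 2 fills it.
std::vector<int> serialize_ids(const std::uint32_t (*ids)[3], int n) {
    int icnt = 0;
    for (int i = 0; i < n; ++i) {
        toy_bbcore_write(nullptr, &icnt, ids[i]);
    }
    std::vector<int> buf(icnt);
    int ik = 0;
    for (int i = 0; i < n; ++i) {
        toy_bbcore_write(buf.data(), &ik, ids[i]);
    }
    assert(ik == icnt);  // both passes must consume the same amount
    return buf;
}
```

If the increment were skipped in the sizing pass, `icnt` would stay at zero and the reader would later index into an empty array, which matches the null `d` seen in the `bbcore_read` frame.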
  • Check memory usage from this PR vs current master
  • bulb3d model:
# master
 Start time (t) = 0

 Memory (MBs) :  After mk_spikevec_buffer : Max 93.3828, Min 93.3828, Avg 93.3828
 Memory (MBs) :     After nrn_finitialize : Max 94.9531, Min 94.9531, Avg 94.9531

 psolve |========================================================| t: 7.97   ETA: 0h00m04s

Solver Time : 4.49385


 Simulation Statistics
 Number of cells: 17057
 Number of compartments: 197648
 Number of presyns: 51143
 Number of input presyns: 0
 Number of synapses: 34159
 Number of point processes: 68455
 Number of transfer (gap) targets: 210
 Number of spikes: 16801
 Number of spikes with non negative gid-s: 16801

# this PR

 Start time (t) = 0

 Memory (MBs) :  After mk_spikevec_buffer : Max 104.8750, Min 104.8750, Avg 104.8750
 Memory (MBs) :     After nrn_finitialize : Max 107.6914, Min 107.6914, Avg 107.6914

 psolve |========================================================| t: 7.97   ETA: 0h00m04s

Solver Time : 4.47372


 Simulation Statistics
 Number of cells: 17057
 Number of compartments: 197648
 Number of presyns: 51143
 Number of input presyns: 0
 Number of synapses: 34159
 Number of point processes: 68455
 Number of transfer (gap) targets: 210
 Number of spikes: 16801
 Number of spikes with non negative gid-s: 16801
  • BBP Model
# master

 Memory (MBs) :  After mk_spikevec_buffer : Max 1075.3008, Min 879.0938, Avg 995.6172

 WARNING: nrn_nrn_wrote_conc support on GPU need to validate!
 Memory (MBs) :     After nrn_finitialize : Max 1075.3008, Min 879.0938, Avg 995.6172
2 reports; sync time: 0.000124; sharing time : 0.001760
open file for header creation
open file for header creation
allreports create time : 0.015181
[REPORTS] [info] :: Initializing PARALLEL implementation...

 psolve |========================================================| t: 50.00  ETA: 0h00m21s

Solver Time : 20.8226


 Simulation Statistics
 Number of cells: 1010
 Number of compartments: 453442
 Number of presyns: 1933156
 Number of input presyns: 340791
 Number of synapses: 3864292
 Number of point processes: 3866392
 Number of transfer (gap) targets: 0
 Number of spikes: 356
 Number of spikes with non negative gid-s: 356

# This PR

 Memory (MBs) :  After mk_spikevec_buffer : Max 1303.1797, Min 1058.8477, Avg 1203.6133

 WARNING: nrn_nrn_wrote_conc support on GPU need to validate!
 Memory (MBs) :     After nrn_finitialize : Max 1303.1797, Min 1058.8477, Avg 1203.6450

 psolve |========================================================| t: 50.00  ETA: 0h01m08s

Solver Time : 68.3404  # Ignore timing because of GCC


 Simulation Statistics
 Number of cells: 1010
 Number of compartments: 453442
 Number of presyns: 1933156
 Number of input presyns: 340791
 Number of synapses: 3864292
 Number of point processes: 3866376
 Number of transfer (gap) targets: 0
 Number of spikes: 356
 Number of spikes with non negative gid-s: 356

@alkino: the intermediate vectors show a 15% increase in memory usage because of the intermediate copies (?). Those vectors get removed after the setup phase, but we will need a strategy to reduce this. I will create a separate ticket and not block this PR.

I WAS WRONG! I was comparing file-based transfer on master against in-memory transfer with this PR. Obviously, in-memory transfer has more overhead. For an apples-to-apples comparison (after disabling reports, because they were disabled in the latter case), with file transfer mode I get:

# master

OUTPUT PARAMETERS
--dt_io=0.1
--outpath=/gpfs/bbp.cscs.ch/project/proj16/kumbhar/pramod_scratch/CNEUR-348/output/
--checkpoint=output1/checkpoint

 Start time (t) = 0

 Memory (MBs) :  After mk_spikevec_buffer : Max 1075.0430, Min 879.1836, Avg 994.8757

 WARNING: nrn_nrn_wrote_conc support on GPU need to validate!
 Memory (MBs) :     After nrn_finitialize : Max 1075.0430, Min 879.1836, Avg 994.8757
[REPORTS] [info] :: Initializing PARALLEL implementation...

 psolve |========================================================| t: 50.00  ETA: 0h00m20s

Solver Time : 20.1608

# this PR

 Memory (MBs) :  After mk_spikevec_buffer : Max 1080.7656, Min 884.9492, Avg 1000.7869

 WARNING: nrn_nrn_wrote_conc support on GPU need to validate!
 Memory (MBs) :     After nrn_finitialize : Max 1080.7656, Min 884.9492, Avg 1000.7869

 psolve |========================================================| t: 50.00  ETA: 0h01m09s

Solver Time : 68.8981 #ignore this because of GCC

So the memory usage is almost the same. All good!
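
To put a number on "almost the same", the relative increase can be computed from the reported values. This is only a small arithmetic sketch; `percent_increase` is a hypothetical helper name.

```cpp
#include <cassert>

// Relative increase of `after` over `before`, in percent.
double percent_increase(double before, double after) {
    assert(before > 0.0);
    return (after - before) / before * 100.0;
}
```

With the file transfer numbers above, `percent_increase(1075.0430, 1080.7656)` for the Max column comes out at roughly 0.5%.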

@bbpbuildbot

Can one of the admins verify this patch?

@pramodk pramodk left a comment

Overall, the separation looks good. Once you are done with the major changes, let me know and I will start reviewing the code details.

@alkino alkino changed the title from "Rewrite phase1" to "Split nrn_setup.cpp in phase1 / phase2" Apr 6, 2020

alkino commented Apr 9, 2020

Please retest

ohm314 previously requested changes Apr 14, 2020

@ohm314 ohm314 left a comment

Great work! I think we are definitely on the right track.

@pramodk pramodk left a comment

Review part I : very nice @alkino! I covered everything except phase2. I have some minor comments that you can already take a look at.

I will review the phase2 monster tomorrow!

@pramodk pramodk left a comment

Work in progress, still reviewing two files

}

void Phase1::shift_gids(int imult) {
    int zz = imult * maxgid;  // offset for each gid

What we are doing is this: say we have a model with XX cells. When we duplicate it by a factor of 5, the new max gid becomes 5 x XX. So simply, new_maxgid would be fine?
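
The idea in this comment can be sketched as follows. Only the first line of `Phase1::shift_gids` is visible in the review, so this is a hypothetical free-function version written for illustration: the name `shift_gids_sketch`, the `gid_offset` variable, and the rule that negative gids are left untouched are all assumptions, not the reviewed code.

```cpp
#include <cassert>
#include <vector>

// Hypothetical sketch of the gid shift: copy number `imult` of a replicated
// model gets its gids shifted by imult * maxgid, so gids from different
// copies never collide.
void shift_gids_sketch(std::vector<int>& gids, int imult, int maxgid) {
    assert(imult >= 0 && maxgid > 0);
    const int gid_offset = imult * maxgid;  // offset for each gid in this copy
    for (int& gid : gids) {
        if (gid >= 0) {  // assumption: negative gids are special and not shifted
            gid += gid_offset;
        }
    }
}
```

For example, with `maxgid = 100` the third copy (`imult = 2`) maps gids 0, 7 and 99 to 200, 207 and 299.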

@pramodk pramodk left a comment

Second last

pramodk commented Apr 22, 2020

Yeah, with the suggestions I provided modifications at one place only and did not change all usages.

You can go through the conversations one by one, apply each change locally on your computer (instead of on GitHub), and then resolve the conversation.

ohm314 commented Apr 22, 2020

I agree with @pramodk's general point that the small nitpicky changes should be applied. They don't change your overall code structure and are easy to apply (just annoying). This is also true for my small nitpicky changes and my request to add documentation to the various functions you touched.

@pramodk pramodk left a comment

My major comment is about memory duplication, see comments.

alkino commented May 2, 2020

Please retest

pramodk commented May 2, 2020

@alkino: I see that the GPU build is causing all the failures. Without the PGI modules we can’t build that. Can you disable the GPU builds for the time being?

alkino commented May 5, 2020

Please retest

@alkino alkino force-pushed the phase1 branch 2 times, most recently from 0c29771 to 5bd8c5c Compare May 5, 2020 13:51

pramodk commented May 7, 2020

Please retest

pramodk commented May 18, 2020

Please retest

@pramodk pramodk closed this Aug 9, 2020
@pramodk pramodk reopened this Aug 9, 2020
@pramodk pramodk merged commit f0acbda into master Aug 9, 2020
@pramodk pramodk deleted the phase1 branch August 9, 2020 21:48
@pramodk pramodk mentioned this pull request Aug 9, 2020
18 tasks
@alkino alkino mentioned this pull request Oct 22, 2020
pramodk added a commit that referenced this pull request Mar 3, 2021
 - acc_memcpy_to_device is used to copy the device pointer
 - so we have to use the address of the device pointer and sizeof(IvocVect*)
   as the size to copy the pointer. Note that IvocVect is already
   copied by previous acc_copyin call
 - this issue was introduced in #283

fixes #501
pramodk added a commit that referenced this pull request Mar 4, 2021
- acc_memcpy_to_device is used to copy the device pointer
- so we have to use an address of the device pointer and size of 
   pointer ie. `sizeof(IvocVect*)`
- note that IvocVect is already copied by previous acc_copyin call
- this issue was introduced in #283

fixes #501
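
The point of this commit message is that the thing being copied is the pointer value itself, so the source is the address of the pointer and the size is that of a pointer. The sketch below is a host-only analogy that uses plain `std::memcpy` in place of `acc_memcpy_to_device`; `Vec`, `Holder` and `patch_pointer` are hypothetical names for this example, and no OpenACC call is made.

```cpp
#include <cassert>
#include <cstring>

struct Vec { int size; };     // hypothetical stand-in for IvocVect
struct Holder { Vec* vec; };  // struct whose pointer member needs patching

// Patch `dst->vec` so it points at `target`. What gets copied is the pointer
// value, so the source is the ADDRESS of the pointer and the size is
// sizeof(Vec*), not sizeof(Vec).
void patch_pointer(Holder* dst, Vec* target) {
    assert(dst != nullptr);
    std::memcpy(&dst->vec, &target, sizeof(Vec*));
}
```

Passing `target` instead of `&target` as the source would copy the first bytes of the pointed-to object into the pointer slot, which is the class of mistake the commit message describes.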
pramodk pushed a commit to neuronsimulator/nrn that referenced this pull request Nov 2, 2022
…cpp) (BlueBrain/CoreNeuron#283)

Major refactoring of the monolithic nrn_setup.cpp into two separate phases, which allows the
code to be re-used for file-based transfer as well as in-memory transfer:
 * Rewrite phase1, phase2 and move into separate .cpp files
 * Remove global static user variables
 * Add OMP_Mutex wrapper for handling OpenMP mutex
 * VecPlay now uses std::vector instead of raw pointers
     - Give ownership of y_, t_ to VecPlayContinuous
 * Remove lock / unlock from fixed_vector as it was not used
 * Remove things related to endianness : code as well as tests
 * Use name of phase instead of numbers with template arguments
* Check mechanism compatibility before reading data
   - if mechanism / mod file is different, we can't read data
     because sizes between neuron and coreneuron will be different.
     This will result in a segfault / memory error.
  - check mechanism compatibility right after mechanism information
     is available

Co-authored-by: Omar Awile <[email protected]>
Co-authored-by: pramodk <[email protected]>

Thanks @alkino for major work!

CoreNEURON Repo SHA: BlueBrain/CoreNeuron@f0acbda
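
The "check mechanism compatibility before reading data" item in the commit message above can be illustrated with a small sketch. It assumes each side can describe its mechanisms as a map from mechanism name to some identifier of the mod file (a checksum, for instance). `MechTable` and `incompatible_mechanisms` are hypothetical names and do not reflect the CoreNEURON data structures.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical representation: mechanism name -> identifier of the mod file
// it was compiled from (for example a checksum).
using MechTable = std::map<std::string, std::string>;

// Return the names of mechanisms known to the writer that are missing or
// different on the reader side. Data should only be read when this list is
// empty, because differing mechanisms imply differing data sizes.
std::vector<std::string> incompatible_mechanisms(const MechTable& writer,
                                                 const MechTable& reader) {
    std::vector<std::string> bad;
    for (const auto& entry : writer) {
        auto it = reader.find(entry.first);
        if (it == reader.end() || it->second != entry.second) {
            bad.push_back(entry.first);
        }
    }
    return bad;
}
```

Running such a check as soon as the mechanism information is available lets the program stop with a clear message, like the "hh is a different MOD file" error quoted earlier, instead of crashing later while reading data.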
pramodk added a commit to neuronsimulator/nrn that referenced this pull request Nov 2, 2022
…euron#502)

- acc_memcpy_to_device is used to copy the device pointer
- so we have to use an address of the device pointer and size of 
   pointer ie. `sizeof(IvocVect*)`
- note that IvocVect is already copied by previous acc_copyin call
- this issue was introduced in BlueBrain/CoreNeuron#283

fixes BlueBrain/CoreNeuron#501

CoreNEURON Repo SHA: BlueBrain/CoreNeuron@b8d9fcd
5 participants