segmentation fault for slightly larger networks, related to version with openmpi #1881
Comments
@ChristianKeup Thanks for reporting this! This seems rather strange indeed. According to the stack trace, the segfault occurs when NEST sorts connections before starting the simulation. This does not happen (if I remember right) when not using MPI. Some questions/suggestions:
Hello,
Maybe it would be good to see if the issue is reproducible on another machine?
Thanks for the update! To build the same network with SLI, use
On my computer, that works fine. Could you try on yours? My suspicion now would be that some libraries are mixed up in the Conda install. If anyone else with a Conda installation of NEST on Linux could test this, that would be useful.
Ah, thanks. This code works on the current master, but on the conda installed version 2.20 it throws an error:
Could it be that in version 2.20 the Connect function took different arguments? Your suspicion is also in line with what Moritz Helias guessed when I mentioned the issue to him. Maybe it links to a wrong version of OpenMPI or a related library.
Thanks @AlexVanMeegen for testing the issue on his laptop. He gets the same segmentation fault error: (Conda install with MPI)
He also pointed out that the very same problem may have been reported by Dominic Standage on the NEST user mailing list last year; the title of that mail was "Microcircuit example, segmentation fault".
For NEST 2.20, which doesn't have the fancy node collections yet, the corresponding script is
@steffengraber Since this seems to be a conda problem, could you take a look?
Thanks, this SLI code generated the same error as the Python script.
Can you run the script in a debugger and report the stack trace?
I can reproduce the issue. The Conda package is built without debugging symbols, so There are two Conda packages for Python 3.9, they depend on Boost 1.72 and 1.74. With a NEST Conda package that doesn't depend on Boost ( So this looks like the Boost sorting problem that was fixed in #1502; that fix is part of 2.20.1. Note that there are also packages for NEST 2.20.1, but none of them are compiled with MPI. @ChristianKeup Can you also try with a version that doesn't depend on Boost, for example
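For readers unfamiliar with the step that fails: the `nest::sort<Source, StaticConnection<...>>` frame in the stack trace takes two `BlockVector` arguments, which suggests that source entries and connection objects live in parallel containers that have to be reordered together. The following is only a Python illustration of that idea, not NEST's C++ implementation and not the Boost code path discussed above; the function name `sort_together` and the toy data are hypothetical.

```python
def sort_together(sources, connections):
    """Sort `sources` ascending and apply the same permutation to `connections`.

    Illustrative only: both lists are reordered in place so that entry i of
    `connections` still belongs to entry i of `sources` afterwards.
    """
    if len(sources) != len(connections):
        raise ValueError("sources and connections must have the same length")

    # Compute the permutation that sorts the keys ...
    order = sorted(range(len(sources)), key=sources.__getitem__)

    # ... and apply it to both containers.
    sources[:] = [sources[i] for i in order]
    connections[:] = [connections[i] for i in order]


if __name__ == "__main__":
    src = [3, 1, 2]                   # hypothetical source node IDs
    conn = ["from3", "from1", "from2"]  # hypothetical connection payloads
    sort_together(src, conn)
    print(src, conn)
```

If the two containers ever get out of step during such a sort, later lookups can read from invalid positions, which is consistent with (but not proof of) the "Address not mapped" failure reported here.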
@hakonsbm Thank you for sorting this out.
@steffengraber this should be documented and then closed. Moving to Documentation project.
Closing as this was an external error and is resolved by dropping MPI from the Conda package.
Hello,
I'm using a laptop with 16 GB memory running Ubuntu 18.04 (of course, I do not quite need OpenMPI on the laptop). If I install NEST with OpenMPI using conda (as in the documentation):
conda create --name ENVNAME -c conda-forge nest-simulator=*=mpi_openmpi*
and then run the following minimal example "simulation" as a Python script:
I get the following output with a segmentation fault error:
[INFO] [2020.12.18 15:55:56 /home/conda/feedstock_root/build_artifacts/nest-simulator_1604245416729/work/nestkernel/rng_manager.cpp:217 @ Network::create_rngs] : Creating default RNGs
[INFO] [2020.12.18 15:55:56 /home/conda/feedstock_root/build_artifacts/nest-simulator_1604245416729/work/nestkernel/rng_manager.cpp:260 @ Network::create_grng_] : Creating new default global RNG
Copyright (C) 2004 The NEST Initiative
Version: nest-2.20.0
Built: Nov 1 2020 15:48:07
This program is provided AS IS and comes with
NO WARRANTY. See the file LICENSE for details.
Problems or suggestions?
Visit https://www.nest-simulator.org
Type 'nest.help()' to find out more about NEST.
Dec 18 15:55:56 NodeManager::prepare_nodes [Info]:
Preparing 5000 nodes for simulation.
[inm6187:15729] *** Process received signal ***
[inm6187:15729] Signal: Segmentation fault (11)
[inm6187:15729] Signal code: Address not mapped (1)
[inm6187:15729] Failing at address: 0xfffffffffffffff8
[inm6187:15729] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0x12980)[0x7f9f7dbfb980]
[inm6187:15729] [ 1] /home/keup/miniconda2/envs/nesttestmpi/lib/python3.9/site-packages/nest/../../../libmodels.so(+0x4b5960)[0x7f9f5f3c0960]
[inm6187:15729] [ 2] /home/keup/miniconda2/envs/nesttestmpi/lib/python3.9/site-packages/nest/../../../libmodels.so(+0x4b5f5e)[0x7f9f5f3c0f5e]
[inm6187:15729] [ 3] /home/keup/miniconda2/envs/nesttestmpi/lib/python3.9/site-packages/nest/../../../libmodels.so(+0x4b7936)[0x7f9f5f3c2936]
[inm6187:15729] [ 4] /home/keup/miniconda2/envs/nesttestmpi/lib/python3.9/site-packages/nest/../../../libmodels.so(+0x4b8882)[0x7f9f5f3c3882]
[inm6187:15729] [ 5] /home/keup/miniconda2/envs/nesttestmpi/lib/python3.9/site-packages/nest/../../../libmodels.so(+0x4b8ae4)[0x7f9f5f3c3ae4]
[inm6187:15729] [ 6] /home/keup/miniconda2/envs/nesttestmpi/lib/python3.9/site-packages/nest/../../../libmodels.so(_ZN4nest4sortINS_6SourceENS_16StaticConnectionINS_24TargetIdentifierPtrRportEEEEEvR11BlockVectorIT_ERS5_IT0_E+0x1d9)[0x7f9f5f3c3d19]
[inm6187:15729] [ 7] /home/keup/miniconda2/envs/nesttestmpi/lib/python3.9/site-packages/nest/../../../libnestkernel.so(_ZN4nest17ConnectionManager16sort_connectionsEi+0x99)[0x7f9f5eda5c19]
[inm6187:15729] [ 8] /home/keup/miniconda2/envs/nesttestmpi/lib/python3.9/site-packages/nest/../../../libnestkernel.so(_ZN4nest17SimulationManager32update_connection_infrastructureEi+0x1f2)[0x7f9f5ed98ae2]
[inm6187:15729] [ 9] /home/keup/miniconda2/envs/nesttestmpi/lib/python3.9/site-packages/nest/../../../libgomp.so.1(GOMP_parallel+0x42)[0x7f9f5e71ee8c]
[inm6187:15729] [10] /home/keup/miniconda2/envs/nesttestmpi/lib/python3.9/site-packages/nest/../../../libnestkernel.so(_ZN4nest17SimulationManager7prepareEv+0x1be)[0x7f9f5ed97e4e]
[inm6187:15729] [11] /home/keup/miniconda2/envs/nesttestmpi/lib/python3.9/site-packages/nest/../../../libnestkernel.so(_ZN4nest17SimulationManager8simulateERKNS_4TimeE+0x12)[0x7f9f5eda3442]
[inm6187:15729] [12] /home/keup/miniconda2/envs/nesttestmpi/lib/python3.9/site-packages/nest/../../../libnestkernel.so(_ZN4nest8simulateERKd+0xc3)[0x7f9f5ed837d3]
[inm6187:15729] [13] /home/keup/miniconda2/envs/nesttestmpi/lib/python3.9/site-packages/nest/../../../libnestkernel.so(_ZNK4nest10NestModule16SimulateFunction7executeEP14SLIInterpreter+0x45)[0x7f9f5ed56b05]
[inm6187:15729] [14] /home/keup/miniconda2/envs/nesttestmpi/lib/python3.9/site-packages/nest/../../../libsli.so(+0x743b3)[0x7f9f5eaa73b3]
[inm6187:15729] [15] /home/keup/miniconda2/envs/nesttestmpi/lib/python3.9/site-packages/nest/../../../libsli.so(_ZN14SLIInterpreter8execute_Em+0x222)[0x7f9f5eaabc62]
[inm6187:15729] [16] /home/keup/miniconda2/envs/nesttestmpi/lib/python3.9/site-packages/nest/../../../libsli.so(_ZN14SLIInterpreter7executeERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x162)[0x7f9f5eaac202]
[inm6187:15729] [17] /home/keup/miniconda2/envs/nesttestmpi/lib/python3.9/site-packages/nest/pynestkernel.so(+0x2c01c)[0x7f9f5fa4c01c]
[inm6187:15729] [18] python(+0x198199)[0x55b0b4506199]
[inm6187:15729] [19] python(_PyEval_EvalFrameDefault+0x608)[0x55b0b454df88]
[inm6187:15729] [20] python(_PyFunction_Vectorcall+0x19a)[0x55b0b450bdea]
[inm6187:15729] [21] python(_PyEval_EvalFrameDefault+0x3ba)[0x55b0b454dd3a]
[inm6187:15729] [22] python(_PyFunction_Vectorcall+0x19a)[0x55b0b450bdea]
[inm6187:15729] [23] python(_PyObject_Call+0x10b)[0x55b0b44cce4b]
[inm6187:15729] [24] python(_PyEval_EvalFrameDefault+0x2eaf)[0x55b0b455082f]
[inm6187:15729] [25] python(+0x1388f0)[0x55b0b44a68f0]
[inm6187:15729] [26] python(_PyFunction_Vectorcall+0x336)[0x55b0b450bf86]
[inm6187:15729] [27] python(_PyEval_EvalFrameDefault+0x4c85)[0x55b0b4552605]
[inm6187:15729] [28] python(+0x1388f0)[0x55b0b44a68f0]
[inm6187:15729] [29] python(PyEval_EvalCodeWithName+0x47)[0x55b0b458cfd7]
[inm6187:15729] *** End of error message ***
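A trace like the one above is easier to read once it is reduced to frame number, library, and symbol. The snippet below is a small sketch of how that could be done; the helper names `parse_frames` and `first_named_frame` are hypothetical, the regular expression assumes the Open MPI backtrace layout shown here, and the sample lines are frames 5 to 7 copied from the trace.

```python
import re

# One backtrace line looks like:
# [host:pid] [ 7] /path/to/lib.so(symbol+0x99)[0x7f...]
FRAME_RE = re.compile(
    r"\[(?P<host>[^:\]]+):(?P<pid>\d+)\]\s+"
    r"\[\s*(?P<index>\d+)\]\s+"
    r"(?P<binary>[^(\s]+)"
    r"\((?P<symbol>[^+)]*)\+(?P<offset>0x[0-9a-fA-F]+)\)"
    r"\[(?P<address>0x[0-9a-fA-F]+)\]"
)


def parse_frames(trace):
    """Return one dict per stack frame found in `trace`; other lines are skipped."""
    frames = []
    for line in trace.splitlines():
        match = FRAME_RE.search(line)
        if match:
            frames.append({
                "index": int(match.group("index")),
                "library": match.group("binary").rsplit("/", 1)[-1],
                # Frames without symbol information have an empty name.
                "symbol": match.group("symbol") or None,
                "offset": match.group("offset"),
            })
    return frames


def first_named_frame(frames):
    """Return the innermost frame that still carries a symbol name, or None."""
    return next((frame for frame in frames if frame["symbol"]), None)


SAMPLE = """\
[inm6187:15729] [ 5] /home/keup/miniconda2/envs/nesttestmpi/lib/python3.9/site-packages/nest/../../../libmodels.so(+0x4b8ae4)[0x7f9f5f3c3ae4]
[inm6187:15729] [ 6] /home/keup/miniconda2/envs/nesttestmpi/lib/python3.9/site-packages/nest/../../../libmodels.so(_ZN4nest4sortINS_6SourceENS_16StaticConnectionINS_24TargetIdentifierPtrRportEEEEEvR11BlockVectorIT_ERS5_IT0_E+0x1d9)[0x7f9f5f3c3d19]
[inm6187:15729] [ 7] /home/keup/miniconda2/envs/nesttestmpi/lib/python3.9/site-packages/nest/../../../libnestkernel.so(_ZN4nest17ConnectionManager16sort_connectionsEi+0x99)[0x7f9f5eda5c19]
"""

if __name__ == "__main__":
    for frame in parse_frames(SAMPLE):
        print(frame["index"], frame["library"], frame["symbol"])
```

The mangled C++ names can then be passed to a demangler such as `c++filt`. Frames 0 to 5 carry no symbol names, so the innermost readable frame is the `nest::sort` instantiation called from `ConnectionManager::sort_connections`.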
If, however, I reduce the number of neurons from 5000 to 4000, or if I install NEST via conda without OpenMPI support, it works just fine (and then also with much larger networks, e.g. 50000 iaf_psc_delta neurons).
system/installation:
Since I don't need MPI on the laptop, the issue is no longer a direct problem for me. Thanks @AlexVanMeegen for pointing me in the right direction. He also suggested that opening an issue could be useful.
Best, Christian