Errors when trying to install NEST on Saga #1435

Closed
ricardomurphy opened this issue Feb 18, 2020 · 21 comments · Fixed by #1566
Labels
I: No breaking change Previously written code will work as before, no one should note anything changing (aside the fix) S: High Should be handled next T: Bug Wrong statements in the code or documentation

Comments

@ricardomurphy

Hello.
When trying to install NEST 2.20.0 on the Norwegian HPC Saga I get:
"There were errors detected during the run of the NEST test suite!"

Here's my install script:

module purge
module load GSL/2.6-GCC-8.3.0
module load Python/3.7.4-GCCcore-8.3.0
module load SciPy-bundle/2019.10-foss-2019b-Python-3.7.4
module load CMake/3.12.1
module list
cmake -DCMAKE_INSTALL_PREFIX=$HOME/nest-2.20.0 \
  -Dwith-python=3 \
  ..
make -j
make install
make installcheck

I attach the report directory.
reports.zip

@heplesser
Contributor

@ricardomurphy Thanks for reporting this. The failing test is

FAIL: test_nn_pre_centered_synapse (nest.tests.test_stdp_nn_synapses.STDPNNSynapsesTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/cluster/home/ricardom/nest-2.20.0/lib64/python3.7/site-packages/nest/tests/test_stdp_nn_synapses.py", line 292, in test_nn_pre_centered_synapse
    self.do_nest_simulation_and_compare_to_reproduced_weight("nn_pre-centered")
  File "/cluster/home/ricardom/nest-2.20.0/lib64/python3.7/site-packages/nest/tests/test_stdp_nn_synapses.py", line 84, in do_nest_simulation_and_compare_to_reproduced_weight
    weight_by_nest, weight_reproduced_independently))
AssertionError: 0.32057074514327416 != 0.8197400647350731 within 7 places (0.49916931959179894 difference) : stdp_nn_pre-centered_synapse test: Resulting synaptic weight 8.197401e-01 differs from expected 3.205707e-01

I cannot reproduce this failure on my laptop at the moment, but we will investigate.

Could you run the test a few times over as follows and report if you always get the same error with the same values (repeating the last line):

cd <install_dir>
source bin/nest_vars.sh
nosetests lib/python3.7/site-packages/nest/tests/test_stdp_nn_synapses.py 
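
One possible way to automate the repetition, sketched under assumptions: the helper names `assertion_lines` and `run_repeatedly` are hypothetical, and the demo command below is only a stand-in that prints a fake `AssertionError` line so the snippet runs anywhere.

```python
import re
import subprocess
import sys


def assertion_lines(output):
    """Return all 'AssertionError: ...' lines found in one run's output."""
    return re.findall(r"^AssertionError:.*$", output, flags=re.MULTILINE)


def run_repeatedly(command, repeats=5):
    """Run command several times and return the set of distinct failure signatures."""
    signatures = set()
    for _ in range(repeats):
        proc = subprocess.run(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # test runners often report on stderr
            universal_newlines=True,
        )
        signatures.add(tuple(assertion_lines(proc.stdout)))
    return signatures


# Stand-in command for demonstration only; it always prints the same line.
demo_command = [sys.executable, "-c", "print('AssertionError: 1 != 2')"]
distinct = run_repeatedly(demo_command, repeats=3)
print("reproducible" if len(distinct) == 1 else "results vary between runs")
```

To use it for this issue, replace `demo_command` with `["nosetests", "lib/python3.7/site-packages/nest/tests/test_stdp_nn_synapses.py"]` after sourcing `bin/nest_vars.sh` in the install directory. A single distinct signature means every run failed with the same values.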

@heplesser heplesser added ZC: Model DO NOT USE THIS LABEL I: No breaking change Previously written code will work as before, no one should note anything changing (aside the fix) ZP: Pending DO NOT USE THIS LABEL S: High Should be handled next T: Bug Wrong statements in the code or documentation labels Feb 18, 2020
@ricardomurphy
Author

ricardomurphy commented Feb 19, 2020 via email

@heplesser
Contributor

@ricardomurphy The traceback shows python2.7, while you built NEST with 3.7. It looks like you may not have loaded all the modules you loaded in your build script before running the test now. Could you try again with all modules loaded?

@ricardomurphy
Author

ricardomurphy commented Feb 19, 2020 via email

@heplesser
Contributor

Ok, we got exactly the same discrepancy between expected and observed in all cases, so the error is perfectly reproducible on your system.

@aserenko @clinssen @jstapmanns Could you take a look at this, since it is related to #865?

@aserenko
Contributor

The issue does not show up on my current setup. I'll try recompiling with a configuration closer to Ricardo's.

@heplesser
Contributor

@ricardomurphy Taking another look at the modules you have loaded, I noticed that modules 3-14 are all "GCCcore-8.3.0" modules, while modules 15-20 are all "Intel-ish" modules. Does Saga offer a complete set of modules that are built with the same compiler suite, either all with GCC or all with Intel compilers? In your module loads,

module load GSL/2.6-GCC-8.3.0
module load Python/3.7.4-GCCcore-8.3.0
module load SciPy-bundle/2019.10-foss-2019b-Python-3.7.4
module load CMake/3.12.1

I also wonder if you explicitly need to load the Python/3.7.4-GCCcore-8.3.0 module? If the SciPy-bundle is configured correctly, it should load a suitable Python module automatically. Then you would only need to search for a fitting GSL module. Could you try to build and test NEST with a consistent setup?
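
To make such a mix easier to spot, the loaded module names can be grouped by the toolchain tag embedded in them. This is only a rough sketch: the list of toolchain names in the regular expression is a hand-picked assumption, the label `system` for untagged modules is my own, and the helper names are hypothetical. Different tags do not necessarily mean incompatible builds (the configuration that passes later in this thread combines GCC, GCCcore and foss tags), so the output is a starting point for inspection, not a verdict.

```python
import re
from collections import defaultdict

# Assumption: a hand-picked, incomplete list of toolchain names used in module tags.
TOOLCHAIN_TAG = re.compile(r"-(GCCcore|GCC|foss|gompi|intel|iccifort|iimpi)-([^-]+)")


def toolchain_of(module_name):
    """Return e.g. 'GCCcore-8.3.0' for 'Python/3.7.4-GCCcore-8.3.0', else 'system'."""
    _, _, version = module_name.partition("/")
    match = TOOLCHAIN_TAG.search(version)
    return f"{match.group(1)}-{match.group(2)}" if match else "system"


def group_by_toolchain(module_names):
    """Map each toolchain tag to the modules that carry it."""
    groups = defaultdict(list)
    for name in module_names:
        groups[toolchain_of(name)].append(name)
    return dict(groups)


# The four modules from the install script in this issue.
modules = [
    "GSL/2.6-GCC-8.3.0",
    "Python/3.7.4-GCCcore-8.3.0",
    "SciPy-bundle/2019.10-foss-2019b-Python-3.7.4",
    "CMake/3.12.1",
]
groups = group_by_toolchain(modules)
for toolchain, names in sorted(groups.items()):
    print(f"{toolchain:15s} {', '.join(names)}")
```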

@heplesser
Contributor

Mirrored by HBP Support Ticket# 483077.

@ricardomurphy
Author

ricardomurphy commented Feb 20, 2020 via email

@ricardomurphy
Author

ricardomurphy commented Feb 21, 2020 via email

@heplesser
Contributor

@ricardomurphy This is strange. The test in question does not depend on GSL at all, so you could just not load the GSL module and build with -Dwith-gsl=OFF. Unfortunately, the test depends on Python, so there is no easy way around the Python module.

@niltonlk
Contributor

I got the same error when I compiled without MPI. However, when I compiled with -Dwith-mpi=ON, the error disappeared, and just to make sure, executing

nosetests lib/python3.7/site-packages/nest/tests/test_stdp_nn_synapses.py

resulted in no error.
I wonder if @ricardomurphy is trying to install without MPI support on an HPC cluster...

The following is outside the scope of this issue, but when compiled with -Dwith-mpi=ON, the test suite reported the following errors:

FAIL: testWithMPI (nest.tests.test_connect_all_patterns.TestConnectAllPatterns)
", ".join(failing_tests))
AssertionError: False is not true : The following tests failed when executing with "mpirun -np 2 nosetests [script]": test_connect_all_to_all.py, test_connect_one_to_one.py, test_connect_fixed_indegree.py, test_connect_fixed_outdegree.py, test_connect_fixed_total_number.py, test_connect_pairwise_bernoulli.py
FAIL: testWithMPI (nest.tests.test_sp.test_mpitests.TestStructuralPlasticityMPI)
'[script]": {}'.format(failing_tests))
AssertionError: False is not true : The following tests failed when executing with "mpirun -np 2 nosetests [script]": ['/opt/ohpc/pub/libs/gnu8/openmpi3/nest/2.20.0/lib64/python3.6/site-packages/nest/tests/test_sp/mpitest_issue_578_sp.py']

These tests produced no errors when executed directly with

mpirun -np 2 nosetests [script]

So I assume that my installation is OK.

I hope this helps and also applies to @ricardomurphy's case.

@heplesser
Contributor

@niltonlk Thank you for the information. Your observation that this issue depends on MPI is surprising, since test_stdp_nn_synapses.py does not use MPI at all. The test is not prepared to work with MPI, so I would not use mpirun -np 2 nosetests on any of the Python tests.

@niltonlk Could you post the precise CMake invocations, the configuration summary CMake prints when it finishes, and your module configuration, both for the case that works and for the case that doesn't?

@ricardomurphy Could you also post your CMake invocation and the configuration summary printed when CMake is done?

@ricardomurphy
Author

ricardomurphy commented Mar 13, 2020 via email

@heplesser
Contributor

@ricardomurphy Thanks! You indeed have Use MPI : No, so the test failure you observe is consistent with @niltonlk's observations. Could you try to build and test with an additional -Dwith-mpi=ON argument to CMake?

@ricardomurphy
Author

ricardomurphy commented Mar 13, 2020 via email

@niltonlk
Contributor

The module configuration was the same in both cases:

module list

Currently Loaded Modules:
autotools
prun/1.3
gnu8/8.3.0
openmpi3/3.1.4
ohpc
cmake/3.15.4
gsl/2.6
fftw/3.3.8
openblas/0.3.7
python/3.6.6


CMake command and Configuration that led to the same error as @ricardomurphy

cmake -DCMAKE_INSTALL_PREFIX:PATH=/home/nilton/opt/nest/2.20.0-1435 -Dwith-python=3 -DPYTHON_EXECUTABLE=`which python3.6` -DNOSETESTS=`which nosetests` -DCYTHON_EXECUTABLE=`which cython` ../src

NEST Configuration Summary

Build type          :
Target System       : Linux
Cross Compiling     : FALSE
C compiler          : GNU 8.3.0 (/opt/ohpc/pub/compiler/gcc/8.3.0/bin/gcc)
C compiler flags    : -O2 -Wall -fopenmp -fdiagnostics-color=auto
C++ compiler        : GNU 8.3.0 (/opt/ohpc/pub/compiler/gcc/8.3.0/bin/c++)
C++ compiler flags  : -std=c++11 -O2 -Wall -fopenmp -fdiagnostics-color=auto
Build dynamic       : ON

Built-in modules    : models;precise;topology
User modules        : None
Python bindings     : Yes (Python 3.6.8: /usr/bin/python3.6)
       Includes     : /usr/include/python3.6m
       Libraries    : /usr/lib64/libpython3.6m.so

Cython bindings     : Yes (Cython 0.29.6: /opt/ohpc/pub/libs/gnu8/python/3.6.6/bin/cython)
Use threading       : Yes (OpenMP: -fopenmp)
Use GSL             : Yes (GSL 2.6)
    Includes        : /opt/ohpc/pub/libs/gnu8/gsl/2.6/include
    Libraries       : /opt/ohpc/pub/libs/gnu8/gsl/2.6/lib/libgsl.so;/opt/ohpc/pub/libs/gnu8/gsl/2.6/lib/libgslcblas.so

Use Readline        : Yes (GNU Readline 6.2)
    Includes        : /usr/include
    Libraries       : /lib64/libreadline.so;/lib64/libncurses.so

Use libltdl         : Yes (LTDL 2.4.6)
    Includes        : /usr/include
    Libraries       : /lib64/libltdl.so

Use doxygen         : Yes (/usr/bin/doxygen)
                    : target doc available
Use MPI             : No
Use MUSIC           : No
Use libneurosim     : No
Use Boost           : No

make installcheck

FAIL: test_nn_pre_centered_synapse (nest.tests.test_stdp_nn_synapses.STDPNNSynapsesTest)
AssertionError: 0.32057074514327416 != 0.8197400647350731 within 7 places : stdp_nn_pre-centered_synapse test: Resulting synaptic weight 8.197401e-01 differs from expected 3.205707e-01
FAILED (SKIP=9, failures=1)


CMake command and Configuration with MPI

cmake -DCMAKE_INSTALL_PREFIX:PATH=/home/nilton/opt/nest/2.20.0-1435 -Dwith-python=3 -DPYTHON_EXECUTABLE=`which python3.6` -DNOSETESTS=`which nosetests` -DCYTHON_EXECUTABLE=`which cython` -Dwith-mpi=ON ../src

NEST Configuration Summary

Build type          :
Target System       : Linux
Cross Compiling     : FALSE
C compiler          : GNU 8.3.0 (/opt/ohpc/pub/compiler/gcc/8.3.0/bin/gcc)
C compiler flags    : -O2 -Wall -fopenmp -pthread -fdiagnostics-color=auto
C++ compiler        : GNU 8.3.0 (/opt/ohpc/pub/compiler/gcc/8.3.0/bin/c++)
C++ compiler flags  : -std=c++11 -O2 -Wall -fopenmp -pthread -fdiagnostics-color=auto
Build dynamic       : ON

Built-in modules    : models;precise;topology
User modules        : None
Python bindings     : Yes (Python 3.6.8: /usr/bin/python3.6)
       Includes     : /usr/include/python3.6m
       Libraries    : /usr/lib64/libpython3.6m.so

Cython bindings     : Yes (Cython 0.29.6: /opt/ohpc/pub/libs/gnu8/python/3.6.6/bin/cython)
Use threading       : Yes (OpenMP: -fopenmp)
Use GSL             : Yes (GSL 2.6)
    Includes        : /opt/ohpc/pub/libs/gnu8/gsl/2.6/include
    Libraries       : /opt/ohpc/pub/libs/gnu8/gsl/2.6/lib/libgsl.so;/opt/ohpc/pub/libs/gnu8/gsl/2.6/lib/libgslcblas.so

Use Readline        : Yes (GNU Readline 6.2)
    Includes        : /usr/include
    Libraries       : /lib64/libreadline.so;/lib64/libncurses.so

Use libltdl         : Yes (LTDL 2.4.6)
    Includes        : /usr/include
    Libraries       : /lib64/libltdl.so

Use doxygen         : Yes (/usr/bin/doxygen)
                    : target doc available
Use MPI             : Yes (MPI: /opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4/bin/mpicxx)
    FLAGS           : -pthread
    Includes        : /opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4/include
    Link Flags      : -Wl,-rpath -Wl,/opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4/lib -Wl,--enable-new-dtags -pthread
    Libraries       : /opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4/lib/libmpi_cxx.so;/opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4/lib/libmpi.so

Use MUSIC           : No
Use libneurosim     : No
Use Boost           : No

make installcheck

FAIL: testWithMPI (nest.tests.test_connect_all_patterns.TestConnectAllPatterns)
", ".join(failing_tests))
AssertionError: False is not true : The following tests failed when executing with "mpirun -np 2 nosetests [script]": test_connect_all_to_all.py, test_connect_one_to_one.py, test_connect_fixed_indegree.py, test_connect_fixed_outdegree.py, test_connect_fixed_total_number.py, test_connect_pairwise_bernoulli.py
FAIL: testWithMPI (nest.tests.test_sp.test_mpitests.TestStructuralPlasticityMPI)
'[script]": {}'.format(failing_tests))
AssertionError: False is not true : The following tests failed when executing with "mpirun -np 2 nosetests [script]": ['/home/nilton/opt/nest/2.20.0/lib64/python3.6/site-packages/nest/tests/test_sp/mpitest_issue_578_sp.py']
FAILED (SKIP=7, failures=2)


As I said before, the only difference was enabling MPI.
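
A quick way to confirm what actually changed between two such summaries is a line diff that ignores layout. The sketch below uses Python's standard `difflib`; the helper names are hypothetical, and the two strings are abridged excerpts of the summaries above (three lines each). On these excerpts the diff shows the `Use MPI` line and also the `-pthread` flag that appears in the compiler flags of the MPI build.

```python
import difflib


def normalized_lines(summary):
    """Collapse runs of whitespace and drop blank lines so layout changes are ignored."""
    return [" ".join(line.split()) for line in summary.splitlines() if line.strip()]


def changed_lines(before, after):
    """Return only the lines that ndiff marks as removed ('- ') or added ('+ ')."""
    diff = difflib.ndiff(normalized_lines(before), normalized_lines(after))
    return [line for line in diff if line.startswith(("- ", "+ "))]


# Abridged excerpts from the two configuration summaries in this comment.
without_mpi = """
C compiler flags : -O2 -Wall -fopenmp -fdiagnostics-color=auto
Use threading : Yes (OpenMP: -fopenmp)
Use MPI : No
"""
with_mpi = """
C compiler flags : -O2 -Wall -fopenmp -pthread -fdiagnostics-color=auto
Use threading : Yes (OpenMP: -fopenmp)
Use MPI : Yes (MPI: /opt/ohpc/pub/mpi/openmpi3-gnu8/3.1.4/bin/mpicxx)
"""

changes = changed_lines(without_mpi, with_mpi)
for line in changes:
    print(line)
```

Pasting the complete summaries into the two strings works the same way, since blank lines and column alignment are normalized away before comparing.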

@heplesser
Contributor

@ricardomurphy Have you made any progress on this problem? One more question: I assume you downloaded the NEST 2.20.0 source code as a tarball from https://github.com/nest/nest-simulator/releases? If we don't make progress here, we should investigate whether someone from the NEST developer group can get access to Saga to test directly.

@ricardomurphy
Author

ricardomurphy commented Mar 31, 2020 via email

@heplesser
Contributor

I have now tested on a Saga login node with

module load git/2.23.0-GCCcore-8.3.0 CMake/3.12.1 GCC/8.3.0 GSL/2.6-GCC-8.3.0 OpenMPI/3.1.4-GCC-8.3.0 Python/3.7.4-GCCcore-8.3.0 SciPy-bundle/2019.10-foss-2019b-Python-3.7.4

cmake -DCMAKE_INSTALL_PREFIX:PATH=$PWD/install -Dwith-mpi=ON ../nest-simulator-2.20.0/

which resulted in the following configuration

--------------------------------------------------------------------------------
NEST Configuration Summary
--------------------------------------------------------------------------------

Build type          : 
Target System       : Linux
Cross Compiling     : FALSE
C compiler          : GNU 8.3.0 (/cluster/software/GCCcore/8.3.0/bin/cc)
C compiler flags    :  -O2 -Wall -fopenmp    -fdiagnostics-color=auto
C++ compiler        : GNU 8.3.0 (/cluster/software/GCCcore/8.3.0/bin/c++)
C++ compiler flags  :  -std=c++11 -O2 -Wall -fopenmp  -fdiagnostics-color=auto
Build dynamic       : ON

Built-in modules    : models;precise;topology
User modules        : None
Python bindings     : Yes (Python 3.7.4: /cluster/software/Python/3.7.4-GCCcore-8.3.0/bin/python)
       Includes     : /cluster/software/Python/3.7.4-GCCcore-8.3.0/include/python3.7m
       Libraries    : /cluster/software/Python/3.7.4-GCCcore-8.3.0/lib/libpython3.7m.so

Cython bindings     : Yes (Cython 0.29.13: /cluster/software/Python/3.7.4-GCCcore-8.3.0/bin/cython)
Use threading       : Yes (OpenMP: -fopenmp)
Use GSL             : Yes (GSL 2.6)
    Includes        : /cluster/software/GSL/2.6-GCC-8.3.0/include
    Libraries       : /cluster/software/GSL/2.6-GCC-8.3.0/lib/libgsl.so;/cluster/software/GSL/2.6-GCC-8.3.0/lib/libgslcblas.so

Use Readline        : No
Use libltdl         : No
Use doxygen         : Yes (/usr/bin/doxygen)
                    : target `doc` available
    `dot` available : Yes (/usr/bin/dot)
                    : target `fulldoc` available
Use MPI             : Yes (MPI: /cluster/software/OpenMPI/3.1.4-GCC-8.3.0/bin/mpicxx)
    FLAGS           : 
    Includes        : /cluster/software/OpenMPI/3.1.4-GCC-8.3.0/include/openmpi;/cluster/software/OpenMPI/3.1.4-GCC-8.3.0/include
    Link Flags      : -Wl,-rpath -Wl,/cluster/software/OpenMPI/3.1.4-GCC-8.3.0/lib -Wl,--enable-new-dtags
    Libraries       : /cluster/software/OpenMPI/3.1.4-GCC-8.3.0/lib/libmpi_cxx.so;/cluster/software/OpenMPI/3.1.4-GCC-8.3.0/lib/libmpi.so

Use MUSIC           : No
Use libneurosim     : No
Use Boost           : No
--------------------------------------------------------------------------------

All tests pass except test_connect_all_patterns.py. The problem here appears to be caused by the convoluted nosetests --> mpirun -np 2 nosetests invocation rather than by a real problem.

@heplesser heplesser self-assigned this Apr 7, 2020
@heplesser heplesser removed ZC: Model DO NOT USE THIS LABEL ZP: Pending DO NOT USE THIS LABEL labels Apr 7, 2020
@ricardomurphy
Author

ricardomurphy commented Apr 14, 2020 via email
