Improve BlockVector to fix issues when sorting with recent versions of Boost #1502

Merged
merged 13 commits into from
Apr 23, 2020

Conversation

hakonsbm
Contributor

@hakonsbm hakonsbm commented Apr 6, 2020

Before Boost version 1.69.0, spreadsort fell back to std::sort for small lists (<1000 elements). From 1.69.0 onward it uses pdqsort from the Boost library instead. Because the pdqsort algorithm uses some operators that neither std::sort nor spreadsort uses, and because of inadequacies in the BlockVector implementation, lists were left unsorted.

When sorting with spreadsort from the Boost library, the iterator of the BlockVector to be sorted is combined with the iterator of a second BlockVector on which the same operations are performed. The combination is implemented with the IteratorPair iterator, which uses iterator_facade from the Boost library. The iterator_facade uses a few core functions to infer the operators of the iterator.

The problem arose when pdqsort used operator-=(). In IteratorPair, that operator is inferred from the function

advance( n ) 
{ 
  sort_iter_ += n; 
  perm_iter_ += n; 
} 

Where sort_iter_ and perm_iter_ are iterators for the two BlockVectors. However, for operator-=(), advance(n) is called with a negative n. The iterator of BlockVector didn't take into consideration that its operator+=(n) could be called with a negative n. This PR improves BlockVector and its iterator to take that into consideration, solving the problem.
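
A minimal sketch of the idea, not the code from this PR: a position in a block-structured container whose operator+= accepts a negative step. The names BlockPos and kBlockSize, and the fixed block size of 4, are assumptions made purely for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical block size; the real BlockVector uses its own constant.
constexpr std::ptrdiff_t kBlockSize = 4;

// Hypothetical stand-in for the position held by a block-based iterator.
struct BlockPos
{
  std::ptrdiff_t block;  // index of the current block
  std::ptrdiff_t offset; // index within the block, kept in [0, kBlockSize)

  // Advance by n elements; n may be negative, as it is when called via operator-=.
  BlockPos&
  operator+=( std::ptrdiff_t n )
  {
    // Going through the flat index means negative steps need no special casing.
    const std::ptrdiff_t flat = block * kBlockSize + offset + n;
    assert( flat >= 0 ); // stepping before the first element is a caller error
    block = flat / kBlockSize;
    offset = flat % kBlockSize;
    return *this;
  }

  // Moving backwards is just advancing by the negated distance.
  BlockPos&
  operator-=( std::ptrdiff_t n )
  {
    return *this += -n;
  }
};
```

In this sketch, `p -= 3` and `p += -3` take the same code path, which mirrors how an operator-=() inferred from advance( n ) ends up passing a negative n.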

In addition, this PR

  • adds additional C++ tests of BlockVector
  • switches to the single header variant for Boost tests, so that NEST can be installed with Boost without static libraries
  • moves to the latest Boost version (1.72.0) on Travis

Fixes #1489
Fixes #1239

@heplesser heplesser added the I: No breaking change, S: Critical, and T: Bug labels Apr 7, 2020
@heplesser heplesser added this to the NEST 2.20.1 milestone Apr 7, 2020
@heplesser heplesser self-requested a review April 7, 2020 08:39
Contributor

@heplesser heplesser left a comment

@hakonsbm Thanks for working this out! I have a few questions and suggestions. In some places I am probably just not seeing through things properly ;).

@@ -594,6 +596,7 @@ function( NEST_PROCESS_WITH_BOOST )
set( BOOST_LIBRARIES "${Boost_LIBRARIES}" PARENT_SCOPE )
set( BOOST_INCLUDE_DIR "${Boost_INCLUDE_DIR}" PARENT_SCOPE )
set( BOOST_VERSION "${Boost_MAJOR_VERSION}.${Boost_MINOR_VERSION}.${Boost_SUBMINOR_VERSION}" PARENT_SCOPE )
include_directories( ${Boost_INCLUDE_DIRS} )
Contributor

Just a little confused here: On line 597, we have BOOST_INCLUDE_DIR and Boost_INCLUDE_DIR, and here we have Boost_INCLUDE_DIRS (plural). Is this all as it should be, or is there some typo?

Contributor Author

Both Boost_INCLUDE_DIR and Boost_INCLUDE_DIRS will probably point to the same location. However, based on the examples in the documentation, and this answer in the CMake forum, Boost_INCLUDE_DIRS is the correct variable to use.

Contributor

I have just updated PR #1477 where I used,
include_directories( ${Boost_INCLUDE_DIR} )

should I change to
include_directories( ${Boost_INCLUDE_DIRS} )
and use
set( BOOST_INCLUDE_DIR "${Boost_INCLUDE_DIRS}" PARENT_SCOPE )
as commented by @hakonsbm ?

And just to add, BOOST_INCLUDE_DIR is a variable set in the scope above the current one (PARENT_SCOPE, see manual), so the following code would not work as expected, because in the current scope the variable still holds its previous value.
include_directories( ${BOOST_INCLUDE_DIR} )

@@ -247,6 +260,9 @@ class BlockVector
*/
int get_max_block_size() const;

// TODO: To make BlockVector a complete random access container, it should also implement
// max_size(), rbegin(), and rend().
Contributor

This should either be done within this PR if straightforward or turned into a follow-up issue, but not stay as a "TODO" comment. If turned into a follow-up issue, would it be an idea to implement these three methods now to just throw an exception?

Contributor Author

@hakonsbm hakonsbm Apr 14, 2020

I have created function definitions that throw exceptions and removed the "TODO".
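
As a rough illustration of that approach, and not the definitions added in this PR, placeholder methods can throw until a real implementation exists. MiniContainer and the choice of std::logic_error are assumptions for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical container used only for illustration; not the NEST BlockVector.
template < typename T >
class MiniContainer
{
public:
  using reverse_iterator = typename std::vector< T >::reverse_iterator;

  void
  push_back( const T& value )
  {
    data_.push_back( value );
  }

  std::size_t
  size() const
  {
    return data_.size();
  }

  // Not implemented yet: fail loudly rather than return a misleading value.
  std::size_t
  max_size() const
  {
    throw std::logic_error( "MiniContainer::max_size() is not implemented" );
  }

  reverse_iterator
  rbegin()
  {
    throw std::logic_error( "MiniContainer::rbegin() is not implemented" );
  }

  reverse_iterator
  rend()
  {
    throw std::logic_error( "MiniContainer::rend() is not implemented" );
  }

private:
  std::vector< T > data_;
};

// Returns true if calling max_size() raises the placeholder exception.
inline bool
max_size_throws()
{
  MiniContainer< int > c;
  c.push_back( 1 );
  assert( c.size() == 1 ); // implemented methods keep working as usual
  try
  {
    c.max_size();
  }
  catch ( const std::logic_error& )
  {
    return true;
  }
  return false;
}
```

The benefit of this design is that the interface is already in place, while any accidental use of an unfinished method is reported at the call site.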

@@ -606,6 +640,13 @@ operator-( const const_iterator& other ) const
return ( block_index_ - other.block_index_ ) * max_block_size + ( this_element_index - other_element_index );
}

template < typename value_type_, typename ref_, typename ptr_ >
inline typename bv_iterator< value_type_, ref_, ptr_ >::reference bv_iterator< value_type_, ref_, ptr_ >::operator[](
Contributor

I am a bit confused: Stripping the template arguments, this looks to me like bv_iterator::operator[]. But do we really want to apply array indexing to the iterator? I'd expected it used on the container. But I may be overlooking something here.

Contributor Author

For iterators, the operator[] is the offset dereference operator, which gets the dereferenced element shifted n places relative to the iterator position. It is required for random access iterators.
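
A small sketch of offset dereferencing, using std::vector iterators as a stand-in for the BlockVector iterator. The function name offset_dereference_demo is made up for this example.

```cpp
#include <cassert>
#include <vector>

// For a random access iterator, it[ n ] is equivalent to *( it + n ).
inline int
offset_dereference_demo()
{
  const std::vector< int > values = { 10, 20, 30, 40, 50 };
  auto it = values.begin() + 2; // points at 30

  assert( it[ 0 ] == *it );
  assert( it[ 2 ] == *( it + 2 ) );  // element two places ahead
  assert( it[ -1 ] == *( it - 1 ) ); // negative offsets are allowed as well

  return it[ -2 ] + it[ 1 ]; // 10 + 40
}
```

Note that the offset is relative to the iterator position, not to the start of the container, which is the difference from indexing the container itself.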

reference.push_back( i );
}
}
~bv_vec_reference_fixture()
Contributor

Insert empty line above

@@ -260,6 +283,119 @@ BOOST_AUTO_TEST_CASE( test_iterator_compare )
BOOST_REQUIRE( not( it_b < it_a ) );
Contributor

Would it make sense to add tests for all comparison operators here?
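
One possible shape for such a test, sketched with a generic helper so it works for any random access iterator. The helper name comparisons_consistent is hypothetical, and std::vector iterators stand in for the BlockVector iterator here.

```cpp
#include <cassert>
#include <vector>

// Checks all six comparison operators for a pair where lo is strictly before hi.
template < typename Iter >
bool
comparisons_consistent( Iter lo, Iter hi )
{
  return ( lo < hi ) && !( hi < lo )      // operator<
    && ( lo <= hi ) && !( hi <= lo )      // operator<=
    && ( hi > lo ) && !( lo > hi )        // operator>
    && ( hi >= lo ) && !( lo >= hi )      // operator>=
    && ( lo != hi ) && !( lo == hi )      // operator!= and operator==
    && ( lo == lo ) && ( lo <= lo ) && ( lo >= lo ); // reflexive cases
}

// Example use with iterators into a std::vector.
inline bool
comparison_demo()
{
  std::vector< int > v( 10 );
  return comparisons_consistent( v.begin() + 2, v.begin() + 7 );
}
```

In a Boost.Test case, the result could be wrapped in BOOST_REQUIRE in the same way as the existing checks.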

}
BOOST_REQUIRE( std::is_sorted( bv_sort.begin(), bv_sort.end() ) );
BOOST_REQUIRE( std::is_sorted( bv_perm.begin(), bv_perm.end() ) );

Contributor

Could you add a comment explaining why you perform both the is_sorted and the equal tests?

@@ -27,78 +27,151 @@
#include <boost/test/unit_test.hpp>

// C++ includes:
#include <algorithm>
#include <vector>

// Includes from libnestutil:
#include "sort.h"

namespace nest
Contributor

Why is everything put in namespace nest?


BOOST_AUTO_TEST_SUITE( test_sort )

/**
* Tests whether two arrays with randomly generated numbers are sorted
* correctly by a single call to sort.
* correctly when sorting with the built-in quicksort.
Contributor

"built-in" == NEST's own?

nest_quicksort( bv0, bv1 );
/**
* Tests whether two arrays with randomly generated numbers are sorted
* correctly when sorting with Boost.
Contributor

We know that Boost switches sort algorithms between small and large containers, with a boundary at 1000 elements according to Boost::sort documentation (see boost::sort::spreadsort::detail::min_sort_size defined in boost/sort/spreadsort/detail/constants.hpp). The tests here should cover both cases.
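
A hedged sketch of how a test could cover both regimes by sweeping container sizes around the 1000-element boundary. The function tandem_sort is a stand-in built on std::sort, not the sort() from sort.h or Boost's spreadsort, and all names and the random seed are assumptions for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Stand-in for the sort under test: sorts keys and applies the same permutation to perm.
// Assumes keys.size() == perm.size().
inline void
tandem_sort( std::vector< int >& keys, std::vector< int >& perm )
{
  std::vector< std::size_t > idx( keys.size() );
  std::iota( idx.begin(), idx.end(), std::size_t{ 0 } );
  std::sort( idx.begin(), idx.end(), [ &keys ]( std::size_t a, std::size_t b ) { return keys[ a ] < keys[ b ]; } );

  std::vector< int > sorted_keys( keys.size() );
  std::vector< int > sorted_perm( perm.size() );
  for ( std::size_t i = 0; i < idx.size(); ++i )
  {
    sorted_keys[ i ] = keys[ idx[ i ] ];
    sorted_perm[ i ] = perm[ idx[ i ] ];
  }
  keys.swap( sorted_keys );
  perm.swap( sorted_perm );
}

// Returns true if both vectors end up sorted and identical for a container of size n.
inline bool
sorted_for_size( std::size_t n )
{
  std::mt19937 rng( 12345 ); // fixed seed keeps the sketch reproducible
  std::vector< int > keys( n );
  std::iota( keys.begin(), keys.end(), 0 );
  std::shuffle( keys.begin(), keys.end(), rng );
  std::vector< int > perm = keys; // same content, so perm must come out sorted too

  tandem_sort( keys, perm );
  return std::is_sorted( keys.begin(), keys.end() ) && std::is_sorted( perm.begin(), perm.end() ) && keys == perm;
}

// Sizes chosen to fall below, at, and above the 1000-element boundary.
inline bool
sweep_sizes_around_threshold()
{
  const std::size_t sizes[] = { 0, 1, 999, 1000, 1001, 5000 };
  for ( std::size_t n : sizes )
  {
    if ( !sorted_for_size( n ) )
    {
      return false;
    }
  }
  return true;
}
```

In an actual test, tandem_sort would be replaced by the sort being tested, so that sizes on both sides of the boundary exercise both algorithms.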

@heplesser heplesser requested review from gtrensch and jougs and removed request for jougs April 7, 2020 21:38
@hakonsbm
Contributor Author

@heplesser Thanks for your review! I have addressed your comments, please have another look.

Contributor

@heplesser heplesser left a comment

@hakonsbm Nicely done!

Contributor

@gtrensch gtrensch left a comment

Nice work!
Please see my comment regarding downloading boost in travis_build.sh.

cp -fr boost_1_72_0 $HOME/.cache
rm -fr boost_1_72_0
CONFIGURE_BOOST="-Dwith-boost=$HOME/.cache/boost_1_72_0"

Contributor

The above handles the boost library differently than, for example, music or sionlib. Is there a specific reason for this? If not, wouldn't it be better to implement this the same way? I would suggest moving the above into a file extras/install_boost.sh, adjusting .travis.yml accordingly, and wrapping boost in an environment variable, as is done for all other libraries.

Contributor Author

@gtrensch The idea was to always compile with Boost, like Travis currently does. But I can implement it in a way similar to the other libraries. Should some Travis jobs not compile with Boost?

Contributor

@hakonsbm The static code check (the first build configuration in the travis build matrix) does not require libboost. I will make a suggestion for a change and create a PR against your branch.

Contributor

@hakonsbm I have created the PR, please check.

niltonlk added a commit to niltonlk/nest-simulator that referenced this pull request Apr 20, 2020
Handle libboost same as all other libraries
Contributor

@gtrensch gtrensch left a comment

Thanks for the contribution! Approving!

Labels
I: No breaking change, S: Critical, T: Bug
Projects
Status: Done
4 participants