Several fixes for IndexType which are not of integral types #154

Merged
merged 4 commits into from
Oct 9, 2021

Contributor

@dav1d-wright dav1d-wright commented Aug 12, 2021

Hi,

I have attempted to use nanoflann for data structures where access to the elements is not provided through indices of integral type, or is provided more efficiently through other accessors. While doing so I found several issues, most of which arose from mixing up the type of the values stored in KDTreeBaseClass::vind with the type used to index into vind.

In many places this worked fine, as the values stored in vind were typically of integral type as well, since they are used as indices into an array or a similar data structure. With these fixes, however, nanoflann can be used with data structures that provide access to their elements through other kinds of accessors (such as pointers).

Here is a quick summary of the changes:

  • Metric adaptors (L1_Adaptor, L2_Adaptor, ...) did not have a template parameter for IndexType. They expected b_idx to be a size_t, so any index passed in was implicitly converted to size_t where such a conversion existed.
  • In several places, IndexType was used to store a position within vind, even though the argument type of std::vector<IndexType>::operator[] has nothing to do with IndexType itself, e.g. in KDTreeBaseClass::Node::node_type and KDTreeBaseClass::divideTree.
  • Since the argument type of std::vector<IndexType>::operator[] is not a template parameter, I used uint64_t for these positions.
  • As IndexType is not necessarily an index, I renamed it to AccessorType.
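
The distinction drawn in the list above can be sketched roughly as follows. This is a minimal illustration, not nanoflann code: `Point`, `AccessorTable`, and `make_pointer_table` are hypothetical names, and the offset type is assumed to be the vector's own `size_type`. The table stores accessors of an arbitrary type (pointers here), while the position used to address the table stays integral.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical point type, used only for this illustration.
struct Point {
    double x;
    double y;
};

// AccessorType is what the table stores (an index, a pointer, ...).
// Offset is what addresses a slot in the table; it stays integral
// regardless of AccessorType.
template <typename AccessorType>
struct AccessorTable {
    using Offset = typename std::vector<AccessorType>::size_type;

    std::vector<AccessorType> vAcc;

    // Returns the accessor stored at position pos.
    const AccessorType& at(Offset pos) const {
        assert(pos < vAcc.size());
        return vAcc[pos];
    }
};

// Fills a table with pointers to the given points.
inline AccessorTable<const Point*> make_pointer_table(
    const std::vector<Point>& pts) {
    AccessorTable<const Point*> table;
    table.vAcc.reserve(pts.size());
    for (const Point& p : pts) {
        table.vAcc.push_back(&p);
    }
    return table;
}
```

With `IndexType` doing double duty, a pointer-valued accessor would have had to serve as a vector position as well, which is the mix-up the list describes.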

I built and ran all tests as well as the examples (all build targets), each of which passed.

I hope you find these changes useful and will merge them into your repository. If there is anything that I should change, don't hesitate to let me know.

David Wright added 2 commits August 12, 2021 15:33
…es that are stored in KDTreeBaseClass::vind with the type of the indices with which the values therein are accessed.

In many places this worked fine, as the values stored in vind are of integral type as well, since they are used as indices into an array or a similar data structure. With these fixes, however, nanoflann can be used with data structures that provide access to their elements through other kinds of accessors (such as pointers).

- Metric adaptors (L1_Adaptor, L2_Adaptor, ...) did not have a template parameter for IndexType. They expected b_idx to be a size_t, so any index passed in was always implicitly converted to size_t where such a conversion existed.
- In several places, IndexType was used to store a position within vind, even though the argument type of std::vector<IndexType>::operator[] has nothing to do with IndexType itself, e.g. in KDTreeBaseClass::Node::node_type and KDTreeBaseClass::divideTree.
- Since the argument type of std::vector<IndexType>::operator[] is not a template parameter, uint64_t is used for these positions.
Owner

@jlblancoc jlblancoc left a comment

Cool! Apart from my comments, please add the changes to the Changelog file (perhaps with a link to this PR's URL?)

@@ -486,17 +492,18 @@ struct SO2_Adaptor {
* \tparam _DistanceType Type of distance variables (must be signed) (e.g.
* float, double)
Owner

Please, add doxygen docs for the new AccessorType here too?

}

- void computeMinMax(const Derived &obj, IndexType *ind, IndexType count,
+ void computeMinMax(const Derived &obj, uint64_t ind, uint64_t count,
Owner

Shouldn't this uint64_t be AccessorType too? Or if it's not, another new symbolic name?
(I'm doing a quick review, sorry if the answer is obvious)

Contributor Author

To my understanding, the original code passed a pointer to IndexType so that the element ind points to, which lives in vInd (now vAcc), could be treated as element 0 of an array. As pointer arithmetic is quite unsafe, I thought it would be somewhat safer and easier to understand to access vAcc explicitly with ind as an offset, since we have access to the array here. By using std::vector::operator[] explicitly, range checks are at least performed in debug mode.
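
The two styles discussed here can be compared in a small sketch. It is a simplified stand-in, not the actual computeMinMax: `min_via_pointer`, `min_via_offset`, and the `values` array are hypothetical, and only a minimum is computed. Whether `operator[]` is range-checked in debug builds depends on the standard library in use, so the offset variant also asserts its bounds explicitly.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Pointer style: ind points into the index array and is treated as
// element 0 of a sub-array. The function cannot verify that
// ind[0..count) actually lies inside the array it came from.
inline int min_via_pointer(const std::vector<int>& values,
                           const std::size_t* ind, std::size_t count) {
    assert(count > 0);
    int best = values[ind[0]];
    for (std::size_t i = 1; i < count; ++i) {
        if (values[ind[i]] < best) best = values[ind[i]];
    }
    return best;
}

// Offset style: the index array is passed whole and ind is a position
// in it, so the requested range can be validated up front.
inline int min_via_offset(const std::vector<int>& values,
                          const std::vector<std::size_t>& vAcc,
                          std::size_t ind, std::size_t count) {
    assert(count > 0 && ind + count <= vAcc.size());
    int best = values[vAcc[ind]];
    for (std::size_t i = 1; i < count; ++i) {
        if (values[vAcc[ind + i]] < best) best = values[vAcc[ind + i]];
    }
    return best;
}
```

Both functions return the same result for a valid range; the difference is that only the second one has enough information to reject an invalid one.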

}
}
} else {
- IndexType idx;
+ uint64_t idx;
Owner

Same comment here and, basically, everywhere uint64_t shows up... seeing a hard-wired type instead of a typename parameter, a "using A=B;", etc. raises a red flag for me ;-)
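
One way to act on this suggestion is to give each role its own alias. The sketch below is an assumption about how that could look, not the code that was merged: `TreeTypes`, `Leaf`, and `leaf_size` are hypothetical, and the 32-bit widths are chosen for illustration only.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical bundle of symbolic type names. Changing a width later
// means editing one alias instead of every use site.
template <typename AccessorType = std::uint32_t>
struct TreeTypes {
    using Accessor = AccessorType;    // what the index vector stores
    using Offset = std::uint32_t;     // position inside the index vector
    using Size = std::uint32_t;       // element counts
    using Dimension = std::int32_t;   // coordinate axis number

    // A leaf covers the half-open range [left, right) of positions.
    struct Leaf {
        Offset left;
        Offset right;
    };

    // Number of accessors covered by a leaf.
    static Size leaf_size(const Leaf& leaf) {
        assert(leaf.left <= leaf.right);
        return leaf.right - leaf.left;
    }
};
```

Note that `Offset` does not change when `Accessor` becomes a pointer type, which keeps the two roles separate.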

return m_data_matrix.get().rows();
else
return m_data_matrix.get().cols();
}

// Returns the dim'th component of the idx'th point in the class:
inline num_t kdtree_get_pt(const IndexType idx, size_t dim) const {
- if(row_major)
+ if (row_major)
Owner

We should add a .clang-format file at some point... but that's for another PR!

- Created type alias for Dimension, Size, Offset
- Use 32bit types for these aliases, as well as default template arguments for AccessorType
- Added doxygen comments for template arguments for metrics adaptors
@dav1d-wright
Contributor Author

@jlblancoc Thank you for reviewing my changes! I have addressed your comments; I hope this is roughly what you had in mind. Once these changes are accepted I am more than happy to create a PR with a .clang-format file. I also have a .pre-commit-config.yaml that you could use to install a pre-commit hook which automatically runs clang-format before each commit.

@dav1d-wright
Contributor Author

Also, I am not sure if this is of interest to you, as it is a modification of the API, but for our code I prefer to have a unified signature for both knnSearch and radiusSearch, and I therefore changed KNNResultSet to adhere to the same API as RadiusResultSet.

@dav1d-wright
Contributor Author

Hi @jlblancoc , did you have a chance to look at my changes with respect to your review comments?

@jlblancoc jlblancoc merged commit 431bd40 into jlblancoc:master Oct 9, 2021
@jlblancoc
Owner

Awesome, great contribution, thanks @dav1d-wright !

I'm just curious: in what context would you find it useful to use e.g. a double as an index? :-) Perhaps I'm biased by common applications to robotics point clouds.

@dav1d-wright
Contributor Author

dav1d-wright commented Oct 12, 2021

Thank you very much for merging my pull request @jlblancoc! Very glad you like my proposed changes.

Actually, the main motivation for supporting a non-integral AccessorType was to enable indexing storage that does not lie in contiguous memory, in which case we use pointers to access the underlying data. We haven't used this with floating point numbers so far, but I am convinced someone could find a use case for it ;)
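
A rough sketch of that use case, under stated assumptions: `Point3`, `ListCloud`, `accessors`, and `l1_distance` are hypothetical and independent of nanoflann; only the method name kdtree_get_pt is taken from the diff above. The points live in a `std::list`, whose nodes are not contiguous, so a pointer is the natural accessor, and the metric is templated on the accessor type instead of assuming size_t.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <list>
#include <vector>

// Hypothetical 3D point.
struct Point3 {
    double xyz[3];
};

// Hypothetical dataset whose points are stored in list nodes.
struct ListCloud {
    using AccessorType = const Point3*;

    std::list<Point3> points;

    std::size_t kdtree_get_point_count() const { return points.size(); }

    // Returns the dim'th component of the point the accessor refers to.
    double kdtree_get_pt(AccessorType acc, std::size_t dim) const {
        assert(acc != nullptr && dim < 3);
        return acc->xyz[dim];
    }

    // Collects one accessor per point; a tree would store these.
    std::vector<AccessorType> accessors() const {
        std::vector<AccessorType> out;
        out.reserve(points.size());
        for (const Point3& p : points) {
            out.push_back(&p);
        }
        return out;
    }
};

// L1 distance between a query point a and the dataset point behind
// b_acc. The accessor type comes from the data source.
template <typename DataSource>
double l1_distance(const DataSource& data, const double* a,
                   typename DataSource::AccessorType b_acc,
                   std::size_t dims) {
    double sum = 0.0;
    for (std::size_t d = 0; d < dims; ++d) {
        sum += std::abs(a[d] - data.kdtree_get_pt(b_acc, d));
    }
    return sum;
}
```

The pointers stay valid as long as no element is erased from the list, which is one reason a node-based container suits this pattern.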

- std::vector<size_t> ret_index(num_results);
+ std::vector<uint32_t> ret_index(num_results);
Contributor

@dschwen dschwen Jan 8, 2022

I'm puzzled by this change (or rather the change that makes this change necessary). For us it breaks backwards compatibility libMesh/libmesh#3122

@cstamatopoulos cstamatopoulos Jan 24, 2022

Additionally, MSVC emits warnings about lossy conversions from size_t when the default AccessorType = uint32_t is used.

1>C:\dev\GSI\Vstars\packages\nanoflann.1.4.2.14\lib\native\include\nanoflann.hpp(1497,1): warning C4267: '=': conversion from 'size_t' to '_Ty', possible loss of data
1>        with
1>        [
1>            _Ty=std::_Vbase
1>        ]

For KDTreeSingleIndexAdaptor, changing the template parameter AccessorType from uint32_t to size_t leads to different conversion warnings, which seem to imply that a uint32_t is still being used somewhere internally.

1>C:\dev\GSI\Vstars\packages\nanoflann.1.4.2.14\lib\native\include\nanoflann.hpp(1561,1): warning C4267: 'argument': conversion from 'size_t' to 'const AccessorType', possible loss of data
1>        with
1>        [
1>            AccessorType=uint32_t
1>        ]
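
Warnings of this kind typically appear where a size_t value is assigned or passed to a narrower integer type. The sketch below shows one possible way to make such a narrowing explicit and checked; it is an assumption about the pattern behind the warnings, not a description of the flagged nanoflann lines, and `checked_narrow` and `make_identity_index` are hypothetical helpers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <type_traits>
#include <vector>

// Converts a size_t to a possibly narrower unsigned type. The cast is
// explicit, so the compiler no longer reports an implicit narrowing,
// and the assert catches values that would not fit.
template <typename AccessorType>
AccessorType checked_narrow(std::size_t value) {
    static_assert(std::is_unsigned<AccessorType>::value,
                  "sketch covers unsigned integral accessors only");
    assert(value <= static_cast<std::size_t>(
                        std::numeric_limits<AccessorType>::max()));
    return static_cast<AccessorType>(value);
}

// Builds the index vector 0, 1, ..., count - 1 with elements of
// AccessorType, narrowing each size_t loop counter explicitly.
template <typename AccessorType = std::uint32_t>
std::vector<AccessorType> make_identity_index(std::size_t count) {
    std::vector<AccessorType> vAcc(count);
    for (std::size_t i = 0; i < count; ++i) {
        vAcc[i] = checked_narrow<AccessorType>(i);
    }
    return vAcc;
}
```

On a 64-bit target, writing `vAcc[i] = i;` with a 32-bit element type is the sort of statement that draws C4267 from MSVC.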
