Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Add setExternal in ManagedBytes #1582

Closed
wants to merge 6 commits into from

Conversation

Nicogene
Copy link
Member

@Nicogene Nicogene commented Mar 9, 2018

This PR add setExternal methods in yarp::os::ManagedBytes for the future class VectorMap.
⚠️ wait #1657 befor merging

Please review code.

* Read vector from a connection.
* return true iff a vector was read correctly
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

void ManagedBytes::setExternal(const char *extData, size_t len)
{
if (!extData)
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use the style used in the rest of the code if (...) {

* @param extData, pointer to the external data.
* @param len, num of elements of the external vector
*/
void setExternal (const char* extData, size_t len)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extra space after the function name

}
clear();
owned = false;
Bytes extb(const_cast<char*>(extData), len);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was wondering why setExternal accepts const and if this is safe...
The Bytes b is not const, therefore the const char can be modified later by the user by using Vector methods.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may be actually the trieckiest part of this PR. The Vector does not check if it owns the memory (with e.g. a bool owned like ManagedBytes) and in fact it should not do it. We must understand whether there is a risk of memory modification or not when the memory is not owned.

@claudiofantacci
Copy link
Collaborator

May you also please the extra spaces in ManagedBytes::operator=()?

}
clear();
owned = false;
Bytes extb(const_cast<char*>(extData), len);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may be actually the trieckiest part of this PR. The Vector does not check if it owns the memory (with e.g. a bool owned like ManagedBytes) and in fact it should not do it. We must understand whether there is a risk of memory modification or not when the memory is not owned.

Copy link
Member

@traversaro traversaro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if in an "external" vector the resize method is called? Note that this will happen also when a vector of dimension different from the current one is read in the read method.

In general, what is the rationale for this change?

If we need a Vector with external buffer semantics, why do not define a new class VectorView with this semantics, to make it clear at the type level if the used Vector is owning its own memory or not?
As this new class should not use any dynamic memory allocation, then it could be doable to create a VectorView on the fly from a yarp;:sig::Vector or for any other kind of C++ vector (std::vector<double>, std::array<double, ..>, Eigen::MatrixXd) similar (but simpler) to how the Eigen::Map class works (https://eigen.tuxfamily.org/dox/group__TutorialMapClass.html).

Note: the name VectorView is inspired from the std::string_view class that is available since C++17, see https://abseil.io/tips/1 .

cc @S-Dafarra this could be related to our discussion of yesterday.

Related old discussions and comments:

@traversaro
Copy link
Member

Additional comment to notify that I modified my review.

@barbalberto
Copy link
Collaborator

@traversaro The rationale behind this is the same behind the image::setExternal, i.e. there is the need of using some yarp vector features on external memory without copying the data.
The need is to have a way to wrap a PCL pointCloud in yarp in a copy-less fashon.

What happens if in an "external" vector the resize method is called?

If a resize has to be called after the setExternal, problems will arise, exactly like it happens now for the yarp image.
This method is for advanced users who knows what they are doing. Some sanity check can be added, but cannot be 100% safe.

Only operation which do not change the memory allocation are safe to be used.
Using a different type like the vectorWrap will not solve this use case because at compile time we do not know if the user will choose to use an internal or external memory.

@traversaro
Copy link
Member

If a resize has to be called after the setExternal, problems will arise, exactly like it happens now for the yarp image.

I was not aware of this, it sounds like a design flaw in yarp::sig::Image that should be fixed (or at least properly documented, neither Image::resize nor Image::setExternal mention this), rather than a pattern to propagate to other classes.

This method is for advanced users who knows what they are doing. Some sanity check can be added, but cannot be 100% safe.

The problem is that, while this peculiar behaviour has always been the case for yarp::sig::Image, for yarp::sig::Vector I personally always wrote code assuming that yarp::sig::Vector could be resized (and this is indeed suggested by the resize signature, that returns void), so I never taken care to check if the resize could fail (the only possible failure was due to a failure in memory allocation, and typical that is much more catastrophic) . I am afraid that most of the users of yarp::sig::Vector assumed the same.

Using a different type like the vectorWrap will not solve this use case because at compile time we do not know if the user will choose to use an internal or external memory.

Not sure about the specific case you are refereeing to, but any interface that takes in a yarp::sig::Vector without the need to resize it can take a mutable VectorView, as you can easily convert a yarp::sig::Vector to a valid VectorView.

@claudiofantacci
Copy link
Collaborator

I don't know the course of history of YARP development, but since there exist a bool owned in ManagedBytes I guess that, at some point, that particular variable has not been used to control whether a derived class, say yarp::sig::Image used in combination with setExternal, was allowed to manipulate memory or not.

I guess that if we want a clean implementation, we either improve yarp::sig::Vector implementation or we use another class that explicitly does not menage memory. The former solution may be more fruitable by users that are familiar with yarp::sig::Vector, but its implementation will be more complicated and documentation must be improved. The latter solution is probably overall a cleaner solution.

This method is for advanced users who knows what they are doing. Some sanity check can be added, but cannot be 100% safe.

Why is that?

@barbalberto
Copy link
Collaborator

I was not aware of this, it sounds like a design flaw in yarp::sig::Image that should be fixed (or at least properly documented, neither Image::resize nor Image::setExternal mention this), rather than a pattern to propagate to other classes.

I think it is a trade-off between efficiency and robustness. If user wants to be safe, data can be copied instead of calling a setExternal.
The fact you were not aware is a sign that it is not causing too much troubles 😝
I agree documentation shall be improved and better state what you can or can't to after a setExternal.

I don't know the course of history of YARP development, but since there exist a bool owned in ManagedBytes I guess that, at some point, that particular variable has not been used to control whether a derived class, say yarp::sig::Image used in combination with setExternal, was allowed to manipulate memory or not.

Images do not use ManagedBytes, they simply have a char *. Anyway, yes, image could have his own bool owned to perform some additional check. Neither I know why it is not present, maybe it was not added because images rarely are reshaped after the creation.

Some sanity check can be added, but cannot be 100% safe.
Why is that?

Because when someone else owns the memory, he/she can shrink or free that memory regardless of what we do, and there is no way for us to be notified. The next access will then cause a segfault.

Some check can be added on our side, for example to skip the resize when the memory is not owned, but as @traversaro said, users usually do not check if the resize has correctly been performed.

But the user knows if a setExternal has been called, so better inform users that it is not valid to call a resize after it.
I think the only real weapon we have is the documentation.

@claudiofantacci
Copy link
Collaborator

But the user knows if a setExternal has been called, so better inform users that it is not valid to call a resize after it.
I think the only real weapon we have is the documentation.

Improving documentation is always good! 😄
Anyway my two cents here is to also "block" some functions when setExternal is called.

Because when someone else owns the memory, he/she can shrink or free that memory regardless of what we do, and there is no way for us to be notified. The next access will then cause a segfault.

If we are clear about the behavior induced by a setExternal call, then it's the developer responsibility to be aware of this aspect.

Anyway, I'm open to either create a new dedicated class or to improve the class in conjunction with setExternal.

@Nicogene
Copy link
Member Author

Nicogene commented Mar 13, 2018

What about yAssert(owned) inside the functions that manage the memory ? e.g. push_back, pop_back, resize etc...
By default owned will be set to true and changed to false in the moment setExternal is called.

@Nicogene
Copy link
Member Author

But the resizing of the vector that owns the memory is not controllable in my opinion.. so if we want this method and then efficiency, we have to accept some risks, as @barbalberto said it is a trade off...

@barbalberto
Copy link
Collaborator

A well done assert should evaporate when compiled in release mode, therefore if yarp is compiled in release, an application using the vector will not be affected by the assert.
My opinion is that whatever we use as a check, it should work also in release mode

Maybe we can have a f2f chat about which precaution measures to adopt.

@claudiofantacci
Copy link
Collaborator

Maybe we can have a f2f chat about which precaution measures to adopt.

I agree 😄 Can you please schedule one?

@Nicogene
Copy link
Member Author

@barbalberto @traversaro @drdanz @claudiofantacci what about tomorrow in the late morning ?

@traversaro
Copy link
Member

@Nicogene Ok for me, please send a meeting request on Outlook, thanks!

@randaz81
Copy link
Member

randaz81 commented Mar 14, 2018

Only loosely related, but just yesterday I was working with std::vector<bool> and I realized that it does not implement method data() which is available in c++11 (http://www.cplusplus.com/reference/vector/vector/data) for generic std::vector<T>.
The reason is that std::vector<bool> has a specialized implementation for space optimization.(https://stackoverflow.com/questions/46115669/why-does-stdvectorbool-have-no-data).
So beware these kind of issues...

@Nicogene Nicogene changed the title Add setExternal in Vector and VectorOf Add setExternal in ManagedBytes Mar 15, 2018
@Nicogene
Copy link
Member Author

A comment to notify the partecipants that setExternal has been removed from the yarp::sig::Vector* classes, it will be implemented in the VectorMap classes.

@Nicogene Nicogene mentioned this pull request Mar 15, 2018
* @param extData, pointer to the external data.
* @param len, size in bytes of the external blob
*/
void setExternal(const char* extData, size_t len);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 comments here.
First of all, this should not be const char*, but just char, since the data is not owned, but can be modified. Second, I suggest to use const Bytes& ext instead, like in the other constructor, in order to make the interface more coherent

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see your point but I tried to be as much coherent as possible to yarp::sig::Image::setExternal:

/**
* Use this to wrap an external image.
* Make sure to that pixel type and padding quantum are
* synchronized (you can set these in the FlexImage class).
*/
void setExternal(const void *data, int imgWidth, int imgHeight);

What is the best option ? 😅

@lornat75
Copy link
Member

interesting discussion on setExternal, why not adding a runtime check? I also like @traversaro idea to use dedicated classes for static check, however, it seems difficult to do it now.

@traversaro
Copy link
Member

@Nicogene I know it is not entirely the fault of the newly introduced setExternal method, but can you please updatae the documentation of the allocate and allocatedOnNeed methods when called on ManagedBytes vector that have the owned attribute set to true ?

@Nicogene
Copy link
Member Author

Ok, I will do a cleanup/documentation extra commit!

@traversaro
Copy link
Member

@Nicogene I do not know if you are still interested in implementing the VectorMap class.

However, I discovered that a std::span class with similar semantics is currently being standardized:

As we did in the past for similar classes (such as RecursiveMutex) the best option may be to just introduce a yarp::os::Span class that follows the interfaced of the future std::span, and eventually migrate to the standard one when it is available on all the platforms that we need to support.

@Nicogene
Copy link
Member Author

Nicogene commented Apr 9, 2018

For now it is in standby but I will take a look, thanks!

@Nicogene Nicogene changed the title Add setExternal in ManagedBytes [WIP] Add setExternal in ManagedBytes Apr 24, 2018
@Nicogene
Copy link
Member Author

This PR has to be finished and rebased, and have to take account to the (possible) introduction of move semantics in ManagedBytes (see #1974).

The move semantics + the shared ownership of the memory are an exploding mix if not managed correctly. I experienced it on my skin with the yarp::sig::Image (#1979)

@traversaro
Copy link
Member

@Nicogene as far as I understand the consensus on this was to drop the setExternal(see #1582 (comment)).

If the consensus changed or in any case the decision of adding setExternal to Vector has been take, it would be nice to comment on the rationale behind this choice and how the methods that assume that the memory is owned (such as resize/push_back/...) will be handled.

@Nicogene
Copy link
Member Author

@traversaro if I do remember correctly, we accord to define a separate class called VectorMap to use shared memory, and keep the Vector as it is.
But probably, for the VectorMap class, we need to implement setExternal in ManagedBytes or use another class, std::span-like.

@traversaro
Copy link
Member

But probably, for the VectorMap class, we need to implement setExternal in ManagedBytes or use another class, std::span-like.

As far as I understand, VectorMap will never need to allocate/deallocate the memory that it is pointing to, so I do not know if there is any advantage of using ManagedBytes in it. Probably using yarp::os::Bytes or manually storing pointer and size would be simpler?

By the way, I just realized: if we want to use VectorMap to receive a vector from the network, how do we handle the case in which the vector size is different from the expected? The read will fail?

@Nicogene
Copy link
Member Author

so I do not know if there is any advantage of using ManagedBytes in it. Probably using yarp::os::Bytes or manually storing pointer and size would be simpler

You are right, Bytes should be enought. 👍

Then this PR triggered a nice discussion about memory management and ownership but I think that can be closed now, and open a different one that introduce VectorMap.

@Nicogene Nicogene closed this Apr 12, 2019
@Nicogene Nicogene deleted the addSetExternal branch April 12, 2019 11:38
@Nicogene
Copy link
Member Author

By the way, I just realized: if we want to use VectorMap to receive a vector from the network, how do we handle the case in which the vector size is different from the expected? The read will fail?

I would say yes, but I have not a strong opinion on that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants