Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use caching_allocator for async allocation/deallocation in HIP #20186

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

lpy
Copy link

@lpy lpy commented Mar 6, 2025

Previously hip allocator maintains its own async allocation and free operation. This patch moves the calls to use allocation/deallocation in caching_allocator instead.

This patch creates a wrapped_buffer structure, where it captures the real underlying hal_buffer, which will be allocated and deallocated via caching_allocator. When using the buffer, instead of assuming it's a hip_buffer, we should check whether it's a wrapped_buffer and acquire the underlying buffer instead before any operation.

BUG: #20134

@lpy
Copy link
Author

lpy commented Mar 6, 2025

@AWoloszyn PTAL

Copy link
Collaborator

@benvanik benvanik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blocking for now because this has some design issues that will need to be worked out. Will sync up with @AWoloszyn in the morning.


static const iree_hal_buffer_vtable_t iree_hal_wrapped_buffer_vtable;

bool iree_hal_wrapped_buffer_isa(iree_hal_buffer_t* base_value) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is unlikely to work in cases where dynamic linking is involved - we ran into issues with this approach previously and had to back it out. The vtable pointer in one library will not match the vtable pointer in another if they both have the vtable compiled in. That may be ok for the extremely limited case of an internal type, but not for a generic type like this.


typedef struct iree_hal_wrapped_buffer_t iree_hal_wrapped_buffer_t;

bool iree_hal_wrapped_buffer_isa(iree_hal_buffer_t *base_value);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style here and elsewhere: asterisks stay with the type

Suggested change
bool iree_hal_wrapped_buffer_isa(iree_hal_buffer_t *base_value);
bool iree_hal_wrapped_buffer_isa(iree_hal_buffer_t* base_value);

@@ -795,6 +795,10 @@ static iree_status_t iree_hal_caching_allocator_export_buffer(
out_external_buffer);
}

bool iree_hal_caching_allocator_isa(iree_hal_allocator_t* base_value) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this definitely won't work (if the wrapped buffer was an implementation detail maybe, but allocators are not). Design-wise, no one should have to know if a caching allocator is used.

@@ -30,10 +30,12 @@
#include "iree/hal/drivers/hip/rccl_dynamic_symbols.h"
#include "iree/hal/drivers/hip/status_util.h"
#include "iree/hal/drivers/hip/stream_command_buffer.h"
#include "iree/hal/utils/caching_allocator.h"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Allocators are injected and cannot be depended on here - this creates a strong dependency on the allocator library and we cannot have that. There should not be a need to know what the allocator is providing the buffers, and there may be multiple nested allocators (you can wrap a debug allocator around a caching allocator and then around a base allocator).

/*device_ptr=*/NULL, /*host_ptr=*/NULL, callback, device->host_allocator,
&buffer);
iree_status_t status = iree_ok_status();
bool is_hip_allocator = iree_hal_hip_allocator_isa(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as above, this is not a check you can make.

Previously hip allocator maintains its own async allocation and free operation.
This patch moves the calls to use allocation/deallocation in caching_allocator
instead.

This patch creates a wrapped_buffer structure, where it captures the real
underlying hal_buffer, which will be allocated and deallocated via
caching_allocator. When using the buffer, instead of assuming it's a hip_buffer,
we should check whether it's a wrapped_buffer and acquire the underlying buffer
instead before any operation.

BUG: iree-org#20134
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants