-
Notifications
You must be signed in to change notification settings - Fork 61
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Replace fa_cache with htable cache #160
Conversation
src/hnsw/node_cache.h
Outdated
HASHCTL hctl; | ||
} NodeCache; | ||
|
||
NodeCache node_cache_create(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we refactor these to have one cache_
api?
It seems the only thing different between the two caches in this PR is the key and values size.
So, we can pass those in create call and have the rest of the arguments be void*
in the data structure definition.
Maybe we could have a type-specific wrapper around the data structure for each instance (node_cache
and bln_cache
) to get some compiler help. Not sure if this is necessary or useful enough, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure!
Benchmarks
|
@Ngalstyan4 can you please re-review? I did the requested changes. |
src/hnsw/htab_cache.c
Outdated
{ | ||
MemoryContext old_context = MemoryContextSwitchTo(cache->hctl.hcxt); | ||
hash_destroy(cache->htab); | ||
MemoryContextDelete(cache->hctl.hcxt); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you delete the memory context you are currently on?
Is that safe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed to switch back to old context before deleting, this was not an issue, but I think it may cause undefined behaviour later.
|
||
void cache_destroy(HTABCache *cache) | ||
{ | ||
MemoryContext old_context = MemoryContextSwitchTo(cache->hctl.hcxt); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Memory context ownership is a little strange here.
It seems the memory context is constructed externally, by the caller, but it is destroyed internally, by this abstract data type. Is there a reason for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking that if I create memory context with the same name they will share the same memory, but I looked through the source code and it seems name is mostly used on errors to display on which memory context the error was appeared. I think it is not critical for us to share the same name of MemoryContext among different cache instances, so I brought back memory context creation inside cache_create
} HTABCache; | ||
|
||
HTABCache cache_create(char *name, MemoryContext ctx); | ||
bool cache_remove(HTABCache *cache, HTABCacheKey *key); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you document the assumptions of this ADT in this file?
E.g. currently the caller creates the memory context, passes ownership to this ADT and this ADT destroys it upon cache_destroy
.
Also answer questions like:
Who owns *key and *entry pointers passed into these API functions? Does the ADT make assumptions about how long those pointers live? (if I understand it correctly, the ADT assumes that anything passed in lives longer than the cache ADT itself. This invariant must be followed by the user of the ADT so this interface file is a good place to document it)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added comments here
src/hnsw/htab_cache.h
Outdated
#include <utils/hsearch.h> | ||
#include <utils/memutils.h> | ||
|
||
typedef int HTABCacheKey; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should use sized integers always (e.g. int32)
Would be great if we could add a ci/cd check for this kinds of things
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to int32 for now, will take a not for CI checks
Replaced
fa_cache
withhtable
cache for nodes when doing inserts and scans.Renamed
Cache
toBlockNumberCache
and createdNodeCache
as the entry and key size differ. Maybe there's a way to make some functions generic, you can mention on PR commentsThis change should improve performance per benchmarks