Forbid standalone usage of <-> operator using statement AST #56
Comments
Wouldn't this mean that if there is an ORDER BY <-> but the index is not triggered, e.g., because the cost estimator determined that it would be more cost effective to do a sequential search, then an error would be thrown?
It's a bit strange to me that I can't do <-> in SELECT. I did this a few times when testing pgvector to sanity check that I was getting the nearest vectors.
Yes, that is true: it will throw an error in the case of sequential scans. But here is some backstory on why we decided to implement this feature. We wanted to make the UX better by maintaining only one operator, which automatically determines which distance function should be called. Currently, in alternative solutions there is a separate operator for each distance function (e.g. <->, <=>, <~>), so you have to remember which operator to use for a particular index. So at this moment we decided to forbid standalone usage of the operator, so it will not be confusing that, let's say, sometimes the operator returns
We can discuss this topic further with @Ngalstyan4; maybe we can come up with a better solution.
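To make the single-operator idea concrete, here is a minimal, self-contained sketch of resolving a metric kind from a support function pointer. It is only a model: `MetricKind`, `OpClassModel`, `get_metric_kind`, and the two distance functions are hypothetical names invented for illustration, not the extension's real types or API. The point it shows is that the metric can only be resolved when an operator class is available, and a standalone call has none.

```c
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the extension's real types. */
typedef enum { METRIC_UNKNOWN, METRIC_L2SQ, METRIC_NEG_DOT } MetricKind;

typedef float (*DistanceFn)(const float *a, const float *b, size_t dim);

/* Squared Euclidean distance. */
float l2sq_dist(const float *a, const float *b, size_t dim)
{
    float sum = 0.0f;
    for (size_t i = 0; i < dim; i++) {
        float d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

/* Negative inner product, so that a smaller value still means "closer". */
float neg_dot_dist(const float *a, const float *b, size_t dim)
{
    float sum = 0.0f;
    for (size_t i = 0; i < dim; i++)
        sum += a[i] * b[i];
    return -sum;
}

/* Models an operator class: here it carries only its support function. */
typedef struct {
    DistanceFn support_fn;
} OpClassModel;

/*
 * Resolve the metric by comparing the support function pointer against
 * the known distance functions. With no operator class (opclass == NULL,
 * modelling a standalone call outside an index) nothing can be resolved.
 */
MetricKind get_metric_kind(const OpClassModel *opclass)
{
    if (opclass == NULL)
        return METRIC_UNKNOWN;
    if (opclass->support_fn == l2sq_dist)
        return METRIC_L2SQ;
    if (opclass->support_fn == neg_dot_dist)
        return METRIC_NEG_DOT;
    return METRIC_UNKNOWN;
}
```

In this model, a call with a `NULL` operator class plays the role of `<->` in a plain SELECT: there is no index, so there is nothing to compare the pointer against.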
* Fix sample size to take from test table count
* Fix data type
Currently we have only one operator, `<->`. In the `src/hnsw/options.c` file there is a function, `HnswGetMetricKind`, which determines the operator class currently being used with an index and detects the right metric kind for the `usearch` index using the support function pointers.

This is great, as we can have a single operator that supports various distance functions. But when it is used outside of an index scope, for example in a `SELECT` statement, the operator cannot automatically detect which distance function should be used.

We are currently throwing an error when `<->` is used outside of an index lookup. We do this using `ExecutorStart_hook`; the hook implementation is defined in `src/hnsw/options.c` as `void executor_hook`. This function receives a `QueryDesc` struct, and we are currently doing regexp matching on `sourceText`. This approach does not cover the case where the operator is used with `ORDER BY` but there is no index scan.

To fix all the cases we might use `plannedstmt`, which contains the AST of the planned statement, where we can find information about the indexes and much more.
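A rough sketch of the idea, using a deliberately simplified plan tree rather than PostgreSQL's real node structs. `PlanNode`, its fields, and `plan_has_standalone_distance_op` are hypothetical names made up for this illustration. In the real hook we would presumably start from `queryDesc->plannedstmt->planTree` and inspect each node's expressions, which is considerably more involved than the boolean flags used here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified, hypothetical plan model; PostgreSQL's Plan nodes are richer. */
typedef enum { NODE_SEQ_SCAN, NODE_INDEX_SCAN, NODE_SORT } NodeTag;

typedef struct PlanNode {
    NodeTag tag;
    /* The node itself evaluates <-> (e.g. in its target list or quals). */
    bool evaluates_distance_op;
    /* Index scans only: <-> is the index ORDER BY key, handled by the index. */
    bool index_orderby_distance_op;
    struct PlanNode *left;
    struct PlanNode *right;
} PlanNode;

/*
 * Returns true if any node in the tree evaluates the operator itself.
 * An index scan that only uses the operator as its ORDER BY key is not
 * flagged, because the index supplies the distance function there.
 */
bool plan_has_standalone_distance_op(const PlanNode *node)
{
    if (node == NULL)
        return false;
    if (node->evaluates_distance_op)
        return true;
    return plan_has_standalone_distance_op(node->left)
        || plan_has_standalone_distance_op(node->right);
}
```

Unlike matching on `sourceText`, a walk like this looks at what the planner actually chose, so an `ORDER BY <->` that ended up as a sort over a sequential scan is caught, while the same query text served by an index scan is allowed.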
After making these changes, there's the `hnsw_operators_todo.sql` test file. The file should be renamed to `hnsw_operators.sql` and included in the `schedule.txt` file.