537: Endpoint to list frontend users #554

Vechtomov · 2023-01-08T21:08:47Z

Resolves #537

Vechtomov · 2023-01-08T23:19:37Z

I've left a couple of questions in #537; please consider them during review.

backend/oasst_backend/api/v1/frontend_users.py

nil-andreu · 2023-01-09T21:39:10Z

backend/oasst_backend/user_repository.py

+            users = users.filter(User.auth_method == auth_method)
+
+        if ge:
+            users = users.filter(User.display_name >= ge)


What is the display name of a User? I think it is a string, so how can a string be compared as greater than or equal to ge?

that is indeed possible and intended here: https://www.educba.com/postgresql-compare-strings/

Strings can be compared in alphabetical order and PostgreSQL supports string comparison operators. https://stackoverflow.com/questions/40601324/comparing-strings-in-postgres-using-comparison-operators

So we are comparing alphabetically? Does display_name follow some order when it is created?

Yes, we are comparing alphabetically. I'm not sure what you mean by following an order when created

I mean I do not understand in which use case we would want to compare alphabetically.

We're also trying to figure it out in the main thread :)
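To make the comparison concrete, here is a minimal sketch in plain Python of the window that the `ge`/`lt` filters select. It is an illustration only: the `select_window` helper and the sample names are hypothetical, and Python compares strings by code point while PostgreSQL uses the column's collation, so mixed-case or non-ASCII names may sort differently in the database.

```python
from typing import Iterable, List, Optional


def select_window(
    names: Iterable[str],
    gte: Optional[str] = None,
    lt: Optional[str] = None,
    max_count: Optional[int] = None,
) -> List[str]:
    """Return the names in the half-open range [gte, lt), sorted, up to max_count."""
    selected = sorted(names)
    if gte is not None:
        selected = [n for n in selected if n >= gte]
    if lt is not None:
        selected = [n for n in selected if n < lt]
    # Slicing with None as the end keeps every remaining element.
    return selected[:max_count]


# Hypothetical display names, not taken from the PR.
names = ["dave", "alice", "carol", "bob", "erin"]
window = select_window(names, gte="b", lt="d")
print(window)  # ['bob', 'carol']
```

Note that "dave" is excluded because it sorts after "d": a longer string with the same prefix compares greater.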

yk

thanks for the PR, I left a few comments.

Regarding your questions:

- yes, for now I'd also accept an api_id like the other method. It's kinda ugly but only trusted frontends can submit one anyway
- the endpoints returning a list of users is fine
- I'm really not a fan of the username string comparisons here. as far as I can tell, we don't even have an index to support a prefix query here (although maybe the composite index would do the work). I'd be much more in favor of a classic limit and offset pagination, instead of the lt, gt, and max_count. @olliestanley is there a particular reason we're doing the prefix queries?

yk · 2023-01-09T22:42:19Z

backend/oasst_backend/api/v1/utils.py

@@ -39,3 +39,7 @@ def prepare_tree(tree: list[Message], tree_id: UUID) -> protocol.MessageTree:
        tree_messages.append(prepare_message(message))

    return protocol.MessageTree(id=tree_id, messages=tree_messages)
+
+
+def prepare_user(u: User) -> protocol.User:


I don't think this is well-named, I'd go with something descriptive like db_user_to_protocol_user or you could add a @staticmethod to the protocol user class like from_db_user to construct one from a db user. Or a to_protocol_user method on the db user class that transforms the db user into a protocol user. I think I'd go for the last option, since the second would create a back-dependency from shared to backend, and doesn't result in a separate extra lonely function like here.
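A minimal sketch of the last option, a `to_protocol_user` method on the db user class. The two dataclasses below are stand-ins, and their field names and the choice of `username` as the protocol id are assumptions for illustration, not the project's actual models.

```python
from dataclasses import dataclass
from uuid import UUID, uuid4


@dataclass
class ProtocolUser:
    """Stand-in for the shared protocol user (field names are assumptions)."""

    id: str
    display_name: str
    auth_method: str


@dataclass
class DbUser:
    """Stand-in for the backend database model (field names are assumptions)."""

    id: UUID
    username: str
    display_name: str
    auth_method: str

    def to_protocol_user(self) -> ProtocolUser:
        # The backend type knows about the shared type, never the reverse,
        # so no back-dependency from shared to backend is created.
        return ProtocolUser(
            id=self.username,
            display_name=self.display_name,
            auth_method=self.auth_method,
        )


db_user = DbUser(id=uuid4(), username="alice-1", display_name="Alice", auth_method="local")
protocol_user = db_user.to_protocol_user()
```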

yk · 2023-01-09T22:44:22Z

backend/oasst_backend/user_repository.py

+        if lt:
+            users = users.filter(User.display_name < lt)
+
+        users = users.order_by(User.display_name)


I know the query planner will probably take care of this, but could we do the sorting before the lt and gt comparisons?

yk · 2023-01-09T22:44:42Z

backend/oasst_backend/user_repository.py

+        if ge:
+            users = users.filter(User.display_name >= ge)


this parameter should really be called gte in this case

olliestanley · 2023-01-10T08:40:32Z

> I'm really not a fan of the username string comparisons here. as far as I can tell, we don't even have an index to support a prefix query here (although maybe the composite index would do the work). I'd be much more in favor of a classic limit and offset pagination, instead of the lt, gt, and max_count. @olliestanley is there a particular reason we're doing the prefix queries?

@andreaskoepf would have to confirm on the reasoning for this, but I assumed it was for cases like wanting to lookup a user by username without knowing auth_method or api_client_id, so you could use these criteria to list all users with a certain username regardless of auth_method, api_client_id

yk · 2023-01-10T09:53:15Z

> @andreaskoepf would have to confirm on the reasoning for this, but I assumed it was for cases like wanting to lookup a user by username without knowing auth_method or api_client_id, so you could use these criteria to list all users with a certain username regardless of auth_method, api_client_id

but then let's add a prefix filter (w/ corresponding index). probably the majority of use cases of the endpoint will be listing and paging, and it seems quite cumbersome to do that with the current implementation, especially when we impose some upper limit on the number of users returned (which we have to).
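One way a prefix filter can be expressed is as a half-open range, which is the form an ordered index can serve directly. The sketch below is an assumption-laden illustration: `prefix_bounds` is a hypothetical helper, and the equivalence between "starts with prefix" and the range only holds under code-point ordering. Under other database collations the bounds may not correspond to a prefix match.

```python
from typing import Optional, Tuple

MAX_CODE_POINT = 0x10FFFF


def prefix_bounds(prefix: str) -> Tuple[str, Optional[str]]:
    """Return (lower, upper) such that lower <= s < upper iff s starts with prefix.

    upper is None when there is no finite upper bound (for example, an empty prefix).
    Assumes code-point ordering of strings.
    """
    # Trailing maximal code points cannot be incremented, so drop them first.
    trimmed = prefix.rstrip(chr(MAX_CODE_POINT))
    if not trimmed:
        return prefix, None
    # Increment the last remaining character to get the exclusive upper bound.
    upper = trimmed[:-1] + chr(ord(trimmed[-1]) + 1)
    return prefix, upper


# Hypothetical display names, not taken from the PR.
names = ["alan", "alice", "amy", "bob", "al"]
lower, upper = prefix_bounds("al")
matches = sorted(n for n in names if lower <= n and (upper is None or n < upper))
print(matches)  # ['al', 'alan', 'alice']
```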

Vechtomov · 2023-01-10T09:58:18Z

Hi, @yk. Thanks for the good review. I've made changes but I still don't understand why we need to pass api_client_id for untrusted clients. We pass api_key with the request and it uniquely defines api_client.

yk · 2023-01-10T10:51:44Z

> Hi, @yk. Thanks for the good review. I've made changes but I still don't understand why we need to pass api_client_id for untrusted clients. We pass api_key with the request and it uniquely defines api_client.

I think in your question, you referred to another endpoint where this is done. The reasoning there was the following: a trusted api key should be able to also see users of other keys, and would pass those keys along, i.e. saying "give me users of this other key". I honestly don't see a super good reason why we might do this, so I'd be comfortable leaving that away and just saying trusted frontends can just read globally.
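The rule proposed here (trusted frontends read globally, everyone else is scoped to the client identified by their API key) could be sketched as follows. `ApiClient` and `user_filter_client_id` are hypothetical names used for illustration only, not the backend's actual types.

```python
from dataclasses import dataclass
from typing import Optional
from uuid import UUID, uuid4


@dataclass(frozen=True)
class ApiClient:
    """Hypothetical stand-in for the client resolved from the request's API key."""

    id: UUID
    trusted: bool


def user_filter_client_id(client: ApiClient) -> Optional[UUID]:
    """Return the api_client_id to filter users by, or None for no restriction."""
    if client.trusted:
        # Trusted frontends may read users across all clients.
        return None
    # Untrusted clients only see users that belong to their own key.
    return client.id


trusted_client = ApiClient(id=uuid4(), trusted=True)
untrusted_client = ApiClient(id=uuid4(), trusted=False)
```

With this shape no caller has to pass an api_client_id explicitly, since the API key already determines it.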

andreaskoepf · 2023-01-12T21:22:24Z

I am a bit irritated by the paging discussion here.

I deliberately asked to select by interval-starting point + limit. Naive paging with offset and limit is not scalable. Selecting result elements BY OFFSET (!) means in many cases the db-engine runs a linear search over the results until it reaches the indices of the index-window to return. For larger datasets and high offsets this has very bad performance characteristics.

Regarding indices: BTREE indices store strings ordered (based on the configured collation) and are always used when string comparisons (==, <=, <, >, >=, etc.) are used. The query engine will also use them for LIKE "prefix%" conditions.
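The difference between the two paging strategies can be sketched with the standard-library `sqlite3` module standing in for PostgreSQL. The table layout, index name, and helper functions are assumptions made for this illustration. Both helpers return the same page, but the interval-start variant seeks to the starting point through the index instead of skipping over `offset` rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (display_name TEXT NOT NULL)")
conn.execute("CREATE INDEX ix_users_display_name ON users (display_name)")
conn.executemany(
    "INSERT INTO users VALUES (?)",
    [(f"user{i:04d}",) for i in range(1000)],  # hypothetical sample data
)


def page_by_offset(offset: int, limit: int) -> list:
    """Naive paging: the engine walks past `offset` rows before returning any."""
    rows = conn.execute(
        "SELECT display_name FROM users ORDER BY display_name LIMIT ? OFFSET ?",
        (limit, offset),
    )
    return [r[0] for r in rows]


def page_by_start(after: str, limit: int) -> list:
    """Interval start + limit: return rows strictly after the last seen name."""
    rows = conn.execute(
        "SELECT display_name FROM users WHERE display_name > ? "
        "ORDER BY display_name LIMIT ?",
        (after, limit),
    )
    return [r[0] for r in rows]


first_page = page_by_start("", 10)
second_page = page_by_start(first_page[-1], 10)
print(second_page == page_by_offset(10, 10))  # True
```

Because display names need not be unique, a real implementation would probably need a tiebreaker column (for example the user id) in both the ORDER BY and the comparison. That detail is left out of this sketch.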

github-actions · 2023-01-12T21:36:38Z

❌ pre-commit failed.
Please run pre-commit run --all-files locally and commit the changes.
Find more information in the repository's CONTRIBUTING.md

andreaskoepf

It should probably be protocol.FrontEndUser instead of protocol.User .. but I can do the fix quickly. Will merge now.. I think the paging-strategy has to be discussed. To me naive paging is the noob option .. but you guys all seem to love it. We should just add naive index paging everywhere and we will see what happens.

Vechtomov added 3 commits January 9, 2023 00:06

added frontend users endpoint

254b400

fix comparison

c6fbc26

added api_client_id filtration

847095d

Vechtomov changed the title ~~[WIP] 537: Endpoint to list frontend users~~ 537: Endpoint to list frontend users Jan 8, 2023

Vechtomov marked this pull request as ready for review January 8, 2023 23:20

Vechtomov requested review from yk and andreaskoepf as code owners January 8, 2023 23:20

fozziethebeat added the backend label Jan 9, 2023

fozziethebeat added this to the Admin MVP milestone Jan 9, 2023

allow untrusted api-clients

ecf41a0

nil-andreu suggested changes Jan 9, 2023

yk reviewed Jan 9, 2023

review fixes

6f71189

Merge branch 'main' into frontend-users-list

c5a35d4

Merge branch 'main' into frontend-users-list

c4d1c80

Update user.py

0fd5dce

andreaskoepf approved these changes Jan 12, 2023

andreaskoepf merged commit 0d646e7 into LAION-AI:main Jan 12, 2023

Vechtomov deleted the frontend-users-list branch January 13, 2023 18:04
