make Geometry comparable #25225

Open
wants to merge 21 commits into
base: master

Conversation

justin2004
@justin2004 justin2004 commented Mar 5, 2025

In PostGIS, instances of Geometry are comparable (that is, you can run select distinct on those columns).
This PR allows Trino to do the same and addresses #24961.

e.g. with the patch applied:

trino> describe postgis.public.movement;                            
     Column      |     Type     | Extra | Comment 
-----------------+--------------+-------+---------
 id              | integer      |       |         
 thing_id        | integer      |       |         
 geometry        | Geometry     |       |         
 geometry_string | varchar(50)  |       |         
 dt              | timestamp(6) |       |          
 speed_mph       | integer      |       |         
(6 rows)                                                           

Query 20250305_174904_00016_abpcz, FINISHED, 1 node                                                                                    
Splits: 19 total, 19 done (100.00%)
0.40 [6 rows, 364B] [14 rows/s, 901B/s]
trino> select geometry from postgis.public.movement;
      geometry       
---------------------
 POINT (-90 38.5)    
 POINT (-90 38.501)  
 POINT (-90 38.5)    
 POINT (-90.1 38.58) 
(4 rows)

Query 20250305_174913_00017_abpcz, FINISHED, 1 node
Splits: 1 total, 1 done (100.00%)
0.23 [4 rows, 0B] [17 rows/s, 0B/s]

trino> select distinct geometry from postgis.public.movement;
      geometry       
---------------------
 POINT (-90 38.501)  
 POINT (-90 38.5)    
 POINT (-90.1 38.58) 
(3 rows)

Query 20250305_174919_00018_abpcz, FINISHED, 1 node
Splits: 1 total, 1 done (100.00%)
0.26 [3 rows, 0B] [11 rows/s, 0B/s]
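For readers unfamiliar with what "comparable" buys here, the following is a minimal stdlib-only sketch of de-duplicating values by the equality and hash of their serialized bytes. It is not the code in this PR: `DistinctBytesSketch`, `serialize`, and `distinct` are hypothetical names, and using the UTF-8 bytes of the WKT text as the serialized form is an assumption made purely for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not Trino or PostGIS code.
final class DistinctBytesSketch
{
    // Assumption: the "serialized geometry" is simply the UTF-8 bytes of its WKT text.
    static byte[] serialize(String wkt)
    {
        return wkt.getBytes(StandardCharsets.UTF_8);
    }

    // Returns the input values with later byte-identical duplicates dropped.
    static List<String> distinct(List<String> wktValues)
    {
        Set<ByteBuffer> seen = new HashSet<>();
        List<String> result = new ArrayList<>();
        for (String wkt : wktValues) {
            // ByteBuffer.equals and hashCode work on the remaining bytes, not on identity.
            if (seen.add(ByteBuffer.wrap(serialize(wkt)))) {
                result.add(wkt);
            }
        }
        return result;
    }

    public static void main(String[] args)
    {
        List<String> rows = List.of(
                "POINT (-90 38.5)",
                "POINT (-90 38.501)",
                "POINT (-90 38.5)",
                "POINT (-90.1 38.58)");
        System.out.println(distinct(rows));
    }
}
```

Note that equality over bytes is stricter than geometric equality: two shapes that cover the same points but are serialized differently would both survive this kind of de-duplication.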

EDIT:
I got the formatter to run (using IntelliJ).

cla-bot bot commented Mar 5, 2025

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Justin Dowdy.
This is most likely caused by a git client misconfiguration; please make sure to:

  1. Check that your git client is configured with an email to sign commits: git config --list | grep email
  2. If not, set it up using: git config --global user.email [email protected]
  3. Make sure that the git commit email is configured in your GitHub account settings, see https://github.com/settings/emails

cla-bot bot commented Mar 5, 2025

Thank you for your pull request and welcome to the Trino community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. Continue to work with us on the review and improvements in this PR, and submit the signed CLA to [email protected]. Photos, scans, or digitally-signed PDF files are all suitable. Processing may take a few days. The CLA needs to be on file before we merge your changes. For more information, see https://github.com/trinodb/cla

so we can do `select distinct` on columns of that type.
postgis allows that.
used google-java-format
using intellij
@justin2004 justin2004 force-pushed the geometry-comparable branch from f4ee92f to 1417e95 Compare March 5, 2025 21:43
@justin2004
Author

I've emailed the signed CLA.

public class GeometryType
        extends AbstractVariableWidthType
{
    public static final GeometryType GEOMETRY = new GeometryType();

    private static final TypeOperatorDeclaration TYPE_OPERATOR_DECLARATION =
Member

I think this can be

private static final TypeOperatorDeclaration TYPE_OPERATOR_DECLARATION = extractOperatorDeclaration(GeometryType.class, lookup(), Slice.class);

See UuidType for an example.
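As a rough, stdlib-only illustration of the general idea behind annotation-driven operator extraction, the sketch below scans a class for annotated static methods and turns them into method handles through a `MethodHandles.Lookup`. Everything in it is hypothetical: the `Operator` annotation, the `extract` helper, and the `byte[]` value type are stand-ins, and the sketch does not reproduce Trino's `extractOperatorDeclaration`, its annotations, or its calling conventions.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical marker annotation; a stand-in, not Trino's operator annotation.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface Operator
{
    String value();
}

final class OperatorExtractionSketch
{
    // Operators are declared as annotated static methods on the type class itself.
    @Operator("EQUAL")
    static boolean equalOperator(byte[] left, byte[] right)
    {
        return Arrays.equals(left, right);
    }

    @Operator("HASH")
    static long hashOperator(byte[] value)
    {
        return Arrays.hashCode(value);
    }

    // Collects every annotated static method of the class as operator name -> method handle.
    static Map<String, MethodHandle> extract(Class<?> type, MethodHandles.Lookup lookup)
    {
        Map<String, MethodHandle> operators = new TreeMap<>();
        for (Method method : type.getDeclaredMethods()) {
            Operator operator = method.getAnnotation(Operator.class);
            if (operator != null && Modifier.isStatic(method.getModifiers())) {
                try {
                    operators.put(operator.value(), lookup.unreflect(method));
                }
                catch (IllegalAccessException e) {
                    throw new IllegalStateException(e);
                }
            }
        }
        return operators;
    }

    // Built once, from a lookup created inside the declaring class.
    static final Map<String, MethodHandle> OPERATORS =
            extract(OperatorExtractionSketch.class, MethodHandles.lookup());

    // Convenience wrapper so callers do not have to handle Throwable.
    static boolean invokeEqual(byte[] left, byte[] right)
    {
        try {
            return (boolean) OPERATORS.get("EQUAL").invoke(left, right);
        }
        catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args)
    {
        System.out.println(OPERATORS.keySet());
        System.out.println(invokeEqual(new byte[] {1, 2}, new byte[] {1, 2}));
    }
}
```

The appeal of this style is that the operator list lives next to the type and cannot drift from a hand-written declaration; the cost is that only the operators and conventions actually declared on the class are picked up.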

Author

I tried this but then the test failed:

[ERROR] Tests run: 4, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 34.16 s <<< FAILURE! -- in io.trino.plugin.geospatial.TestGeoSpatialQueries
[ERROR] io.trino.plugin.geospatial.TestGeoSpatialQueries.testDistinctGeometry -- Time elapsed: 9.717 s <<< ERROR!
io.trino.testing.QueryFailedException: io.trino.spi.TrinoException: Geometry READ_VALUE operator can not be adapted to convention (([FLAT])FAIL_ON_NULL). Available implementations: [([BLOCK_POSITION_NOT_NULL])FAIL_ON_NULL, ([NEVER_NULL])BLOCK_BUILDER]
        at io.trino.testing.AbstractTestingTrinoClient.execute(AbstractTestingTrinoClient.java:138)
        at io.trino.testing.DistributedQueryRunner.executeInternal(DistributedQueryRunner.java:565)
        at io.trino.testing.DistributedQueryRunner.execute(DistributedQueryRunner.java:548)
        at io.trino.sql.query.QueryAssertions$QueryAssert.lambda$new$1(QueryAssertions.java:317)
        at com.google.common.base.Suppliers$NonSerializableMemoizingSupplier.get(Suppliers.java:200)
        at io.trino.sql.query.QueryAssertions$QueryAssert.result(QueryAssertions.java:436)
        at io.trino.plugin.geospatial.TestGeoSpatialQueries.testDistinctGeometry(TestGeoSpatialQueries.java:58)

    }

    @Override
    public boolean isComparable()
Member

This could implement isOrderable() = true as well.
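The thread does not say which ordering an orderable Geometry would use. Purely as an assumption, the sketch below orders values by unsigned lexicographic comparison of their serialized bytes, which yields a total order that is consistent with byte equality. `ByteOrderSketch` and `sortByBytes` are hypothetical names, and the UTF-8 bytes of the WKT text again stand in for a real serialized form.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not the ordering implemented in this PR.
final class ByteOrderSketch
{
    // Assumption: the "serialized geometry" is simply the UTF-8 bytes of its WKT text.
    static byte[] serialize(String wkt)
    {
        return wkt.getBytes(StandardCharsets.UTF_8);
    }

    // Sorts a copy of the input by unsigned lexicographic byte order.
    static List<String> sortByBytes(List<String> wktValues)
    {
        List<String> sorted = new ArrayList<>(wktValues);
        sorted.sort((left, right) -> Arrays.compareUnsigned(serialize(left), serialize(right)));
        return sorted;
    }

    public static void main(String[] args)
    {
        List<String> rows = List.of(
                "POINT (-90.1 38.58)",
                "POINT (-90 38.501)",
                "POINT (-90 38.5)");
        System.out.println(sortByBytes(rows));
    }
}
```

An order like this is enough for `ORDER BY` and sort-based operators to be deterministic, but it carries no spatial meaning: in the example, the point with the smallest x coordinate sorts last.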

@electrum
Member

electrum commented Mar 6, 2025

Can you add tests for this in TestGeoSpatialQueries?

@justin2004
Author

justin2004 commented Mar 6, 2025

Can you add tests for this in TestGeoSpatialQueries?

@electrum, done.

before this change the test says:

[ERROR]   TestGeoSpatialQueries.testDistinctGeometry:55 » QueryFailed line 1:1: DISTINCT can only be applied to comparable types (actual: Geometry): geom

wendigo and others added 13 commits March 6, 2025 14:50
- Reuse new module as core for default tarball
- Avoid duplication of artifact set definitions
- Most plugins are not included in the core
- trino-server tarball is essentially unchanged
- Improve content of README in binaries
Looks like a copy-paste error. The latter parts of this assertion use
this one.
Remove nested loop to find ambiguous tables. This optimizes compute from O(n^2) to 2*O(n) when bigquery.case-insensitive-name-matching is enabled
The table column reference was registered incorrectly with the original
case taken from the view definition. It then failed to match the column
schema returned from `SystemAccessControl` and the mask was not applied.

Instead of sprinkling `toLowerCase()` here and there, we will associate
the original `Field` with the column mask and use `Field#canResolve` to
do the matching. The problem with this is that there's no way to do
efficient lookups by name in a case-insensitive way, so we have to
iterate the list of `Field`s to find a match.

Fixes trinodb#24054.
In the plugins page in the documentation for now. In a follow up
we can add these links on the doc pages for specific plugins that
we no longer ship with the default binaries.
7 participants