Improve Iceberg table properties building #25183
base: master
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java
efea506 to 46c61e3
Iterable<FileScanTask> files = () -> lazyFiles.get().iterator();
Iterable<PartitionStruct> partitionStructs = () -> lazyFiles.get().stream()
        .map(PartitionStruct::new)
        .distinct()
StructLike does not implement equals and hashCode. I think we need to use StructLikeWrapperWithFieldIdToIndex instead.
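To illustrate the concern, here is a minimal, self-contained sketch of why `Stream.distinct()` does not de-duplicate objects that lack `equals` and `hashCode`. `RawPartition` and `countDistinct` are hypothetical stand-ins invented for this example; they are not Iceberg or Trino classes.

```java
import java.util.List;

public class DistinctDemo {
    // Hypothetical stand-in for a type that, like StructLike in the comment above,
    // does not override equals/hashCode and so falls back to identity comparison.
    static final class RawPartition {
        final Object[] values;

        RawPartition(Object... values) {
            this.values = values;
        }
    }

    // Stream.distinct() compares elements with equals/hashCode, so two instances
    // holding the same values are still treated as different elements here.
    static long countDistinct(List<RawPartition> partitions) {
        return partitions.stream().distinct().count();
    }

    public static void main(String[] args) {
        List<RawPartition> partitions = List.of(
                new RawPartition("2024-01-01"),
                new RawPartition("2024-01-01"),
                new RawPartition("2024-01-02"));
        // Prints 3, not 2: the two equal-valued partitions are not merged.
        System.out.println(countDistinct(partitions));
    }
}
```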
This will use a LinkedHashSet internally that also uses PartitionSpec for de-duplication and holds on to that object.
Since checking the partition values is sufficient, it would be better to explicitly create a HashSet of StructLikeWrapperWithFieldIdToIndex and write something similar to io.trino.plugin.iceberg.PartitionsTable#getStatisticsByPartition.
👍 updated with StructLikeWrapperWithFieldIdToIndex
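For readers following the thread, the suggested approach can be sketched roughly as below: wrap each file's partition values in a key with value-based `equals`/`hashCode` and track the keys in an explicit `HashSet`, so each partition is handled once. This is only an illustration under assumptions. `PartitionKey` and `FileTask` are hypothetical stand-ins for StructLikeWrapperWithFieldIdToIndex and FileScanTask, and the code is not taken from the PR.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class PartitionDedupSketch {
    // Hypothetical wrapper that compares partitions by their values only.
    static final class PartitionKey {
        private final Object[] values;

        PartitionKey(Object... values) {
            this.values = values.clone();
        }

        @Override
        public boolean equals(Object other) {
            return other instanceof PartitionKey
                    && Arrays.equals(values, ((PartitionKey) other).values);
        }

        @Override
        public int hashCode() {
            return Arrays.hashCode(values);
        }

        @Override
        public String toString() {
            return Arrays.toString(values);
        }
    }

    // Hypothetical stand-in for a scan task: a file path plus its partition values.
    static final class FileTask {
        final String path;
        final Object[] partition;

        FileTask(String path, Object... partition) {
            this.path = path;
            this.partition = partition;
        }
    }

    // Returns each distinct partition once, in first-seen order,
    // no matter how many files share that partition.
    static List<PartitionKey> distinctPartitions(Iterable<FileTask> files) {
        Set<PartitionKey> seen = new HashSet<>();
        List<PartitionKey> result = new ArrayList<>();
        for (FileTask file : files) {
            PartitionKey key = new PartitionKey(file.partition);
            if (seen.add(key)) {
                result.add(key);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<FileTask> files = List.of(
                new FileTask("a.parquet", "2024-01-01"),
                new FileTask("b.parquet", "2024-01-01"),
                new FileTask("c.parquet", "2024-01-02"));
        // Prints [[2024-01-01], [2024-01-02]]
        System.out.println(distinctPartitions(files));
    }
}
```

The set holds only the partition values, which matches the reviewer's point that comparing values is sufficient and nothing larger needs to be retained.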
b3f40bd to b6f21f6
When building Iceberg table properties with many files under one partition, process this partition only once
b6f21f6 to 90bfe5a
Description
When building Iceberg table properties for a table with many files under one partition, process that partition only once instead of once per file.