[fix](parquet)Fix the be core issue when reading parquet unsigned types. #39926
run buildall
clang-tidy review says "All clean, LGTM! 👍"
TPC-H: Total hot run time: 38574 ms
TPC-DS: Total hot run time: 187957 ms
ClickBench: Total hot run time: 30.69 s
LGTM
PR approved by at least one committer and no changes requested.
PR approved by anyone and no changes requested.
LGTM
Proposed changes
Doris has no unsigned integer types, so each Parquet unsigned type is mapped to a wider signed Doris type; for example, Parquet uint32 becomes Doris bigint (int64).
When reading a Parquet file, the byte size of a value as stored in Parquet can differ from the byte size of the Doris type it is mapped to. This mismatch caused the BE (backend) to crash with a core dump.
Fix:
Read each value using the byte size stored in Parquet, then convert it to the mapped Doris type.
Type mapping:
Parquet -> Doris
UInt8 -> Int16
UInt16 -> Int32
UInt32 -> Int64
UInt64 -> Int128
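The read-then-convert approach described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `read_unsigned_column` and its template parameters are hypothetical names, and the input is assumed to be a plain, densely packed buffer of fixed-width values. Each value is copied out using the width of the stored type, reinterpreted as unsigned, and only then widened to the larger signed type.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper (not Doris code).
//   PhysicalT: type whose byte size matches what is stored in the file
//   UnsignedT: the unsigned logical type of the column
//   DorisT:    the wider signed type the column is mapped to
template <typename PhysicalT, typename UnsignedT, typename DorisT>
std::vector<DorisT> read_unsigned_column(const uint8_t* data, size_t count) {
    static_assert(sizeof(DorisT) > sizeof(UnsignedT),
                  "target type must be wider than the unsigned source type");
    std::vector<DorisT> out;
    out.reserve(count);
    for (size_t i = 0; i < count; ++i) {
        // Step through the buffer by the stored size, not sizeof(DorisT).
        PhysicalT raw;
        std::memcpy(&raw, data + i * sizeof(PhysicalT), sizeof(PhysicalT));
        // Reinterpret as unsigned first so large values do not turn negative,
        // then widen to the signed target type.
        out.push_back(static_cast<DorisT>(static_cast<UnsignedT>(raw)));
    }
    return out;
}
```

For a UInt32 -> Int64 column, a call such as `read_unsigned_column<int32_t, uint32_t, int64_t>(buf, n)` advances 4 bytes per value and yields 4294967295 for the stored bit pattern 0xFFFFFFFF. Stepping through the same buffer 8 bytes at a time would read past the end of the data, which is the kind of size mismatch the description above refers to.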