-
Notifications
You must be signed in to change notification settings - Fork 5.5k
fix: Support more type conversion for decimal partition value #26240
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Reviewer's GuideEnhanced DECIMAL partition value deserialization in PartitionData.getValue with type-specific handling and added comprehensive unit tests covering all numeric input scenarios and JSON round-trip cases. Flow diagram for DECIMAL partition value deserialization logicflowchart TD
A["partitionValue.isLong()"] -->|true| B["BigDecimal.valueOf(partitionValue.asLong(), scale)"]
A -->|false| C["partitionValue.isInt()"]
C -->|true| D["BigDecimal.valueOf(partitionValue.asInt(), scale)"]
C -->|false| E["partitionValue.isBigInteger()"]
E -->|true| F["new BigDecimal(partitionValue.bigIntegerValue(), scale)"]
E -->|false| G["partitionValue.decimalValue().setScale(scale)"]
File-Level Changes
Possibly linked issues
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey there - I've reviewed your changes - here's some feedback:
- In the DECIMAL branch, extract the DecimalType cast and its scale into local variables up front to avoid repeated casts and improve readability.
- The fallback uses setScale without a RoundingMode, which will throw if the input has more fractional digits than the target scale—consider specifying an explicit RoundingMode or validating precision to make deserialization behavior clearer.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In the DECIMAL branch, extract the DecimalType cast and its scale into local variables up front to avoid repeated casts and improve readability.
- The fallback uses setScale without a RoundingMode, which will throw if the input has more fractional digits than the target scale—consider specifying an explicit RoundingMode or validating precision to make deserialization behavior clearer.
## Individual Comments
### Comment 1
<location> `presto-iceberg/src/main/java/com/facebook/presto/iceberg/PartitionData.java:197-187` </location>
<code_context>
+ else if (partitionValue.isBigInteger()) {
+ return new BigDecimal(partitionValue.bigIntegerValue(), ((DecimalType) type).scale());
+ }
+ else {
+ return partitionValue.decimalValue().setScale(((DecimalType) type).scale());
+ }
}
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Consider specifying rounding mode when setting scale on BigDecimal.
Omitting the rounding mode can cause ArithmeticException if the value cannot be exactly represented at the desired scale. Specify a rounding mode to prevent this.
Suggested implementation:
```java
return partitionValue.decimalValue().setScale(((DecimalType) type).scale(), java.math.RoundingMode.HALF_UP);
```
If `RoundingMode` is not already imported at the top of the file, add:
```java
import java.math.RoundingMode;
```
You may want to choose a different rounding mode depending on your requirements.
</issue_to_address>
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
catch (IOException e) { | ||
throw new UncheckedIOException("Failed during JSON conversion of " + partitionValue, e); | ||
} | ||
case DECIMAL: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion (bug_risk): Consider specifying rounding mode when setting scale on BigDecimal.
Omitting the rounding mode can cause ArithmeticException if the value cannot be exactly represented at the desired scale. Specify a rounding mode to prevent this.
Suggested implementation:
return partitionValue.decimalValue().setScale(((DecimalType) type).scale(), java.math.RoundingMode.HALF_UP);
If RoundingMode
is not already imported at the top of the file, add:
import java.math.RoundingMode;
You may want to choose a different rounding mode depending on your requirements.
34e41e3
to
8b634d9
Compare
@hantangwangd Not sure if you are interested in reviewing this PR, this PR is motivated when reviewing PR #26174 (comment) and I was suggested to move this part and submit a new PR. |
@aditi-pandit Can you please take a look. This PR is per your comments in #26174 (comment) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@PingLiuPing have you ever encountered this problem during real usage? If so, could you also add a similar test case which would fail without this change in IcebergDistributedTestBase
or IcebergDistributedSmokeTestBase
? Thanks.
@hantangwangd Thanks for the comment. |
@hantangwangd Just want to split this piece of code separately. The purpose of this PR is serving the native iceberg write. I have test cases to cover this from E2E point of view. And I will deliver those testcases once the partition transform related code is merged. |
Description
Previously, PartitionData.getValue() always used decimalValue().setScale()
when deserializing DECIMAL partition values from JSON, which could fail
or produce incorrect results when the JSON contains integer representations
(long, int, or BigInteger) of decimal values.
This change adds type-specific handling for decimal deserialization:
Motivation and Context
Impact
Test Plan
Added comprehensive unit tests covering all code paths including:
Contributor checklist
Release Notes
Please follow release notes guidelines and fill in the release notes below.