fix: Support more type conversion for decimal partition value #26240

PingLiuPing · 2025-10-06T12:47:57Z

Description

Previously, PartitionData.getValue() always used decimalValue().setScale()
when deserializing DECIMAL partition values from JSON, which could fail
or produce incorrect results when the JSON contains integer representations
(long, int, or BigInteger) of decimal values.

This change adds type-specific handling for decimal deserialization:

Use BigDecimal.valueOf(long, scale) when JSON contains a long value
Use BigDecimal.valueOf(int, scale) when JSON contains an int value
Use BigDecimal(BigInteger, scale) when JSON contains a BigInteger value
Fall back to decimalValue().setScale() for other numeric types

Motivation and Context

Impact

Test Plan

Added comprehensive unit tests covering all code paths including:

Decimal from long values
Decimal from int values
Decimal from BigInteger values
Decimal from decimal values (fallback)
Edge cases (zero scale, large scale, negative values, very large numbers)
JSON round-trip tests with mixed types

Contributor checklist

Please make sure your submission complies with our contributing guide, in particular code style and commit standards.
PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
Documented new properties (with its default value), SQL syntax, functions, or other functionality.
If release notes are required, they follow the release notes guidelines.
Adequate tests were added if applicable.
CI passed.

Release Notes

Please follow release notes guidelines and fill in the release notes below.

== RELEASE NOTES ==

Iceberg Connector Changes
* Add more type conversion for decimal partition value.

sourcery-ai · 2025-10-06T12:48:05Z

Reviewer's Guide

Enhanced DECIMAL partition value deserialization in PartitionData.getValue with type-specific handling and added comprehensive unit tests covering all numeric input scenarios and JSON round-trip cases.

Flow diagram for DECIMAL partition value deserialization logic

flowchart TD
    A["partitionValue.isLong()"] -->|true| B["BigDecimal.valueOf(partitionValue.asLong(), scale)"]
    A -->|false| C["partitionValue.isInt()"]
    C -->|true| D["BigDecimal.valueOf(partitionValue.asInt(), scale)"]
    C -->|false| E["partitionValue.isBigInteger()"]
    E -->|true| F["new BigDecimal(partitionValue.bigIntegerValue(), scale)"]
    E -->|false| G["partitionValue.decimalValue().setScale(scale)"]

File-Level Changes

Change	Details	Files
Enhanced DECIMAL partition value deserialization with type-specific handling	Handle JSON long nodes via BigDecimal.valueOf(long, scale) Handle JSON int nodes via BigDecimal.valueOf(int, scale) Handle JSON BigInteger nodes via new BigDecimal(BigInteger, scale) Fallback to decimalValue().setScale() for other numeric types	`presto-iceberg/src/main/java/com/facebook/presto/iceberg/PartitionData.java`
Added comprehensive unit tests for decimal partition value conversion and JSON round-trip	Test null input handling across types Test decimal from long, int, BigInteger and fallback paths Validate edge cases: zero scale, large scale, negative and very large numbers JSON round-trip tests for decimals and mixed-type partitions	`presto-iceberg/src/test/java/com/facebook/presto/iceberg/TestPartitionData.java`

Possibly linked issues

#Parquet Decimal Metadata Issue: The PR resolves the issue by correctly handling decimal precision and scale during deserialization of partition values.

Tips and commands

Interacting with Sourcery

Trigger a new review: Comment @sourcery-ai review on the pull request.
Continue discussions: Reply directly to Sourcery's review comments.
Generate a GitHub issue from a review comment: Ask Sourcery to create an
issue from a review comment by replying to it. You can also reply to a
review comment with @sourcery-ai issue to create an issue from it.
Generate a pull request title: Write @sourcery-ai anywhere in the pull
request title to generate a title at any time. You can also comment
@sourcery-ai title on the pull request to (re-)generate the title at any time.
Generate a pull request summary: Write @sourcery-ai summary anywhere in
the pull request body to generate a PR summary at any time exactly where you
want it. You can also comment @sourcery-ai summary on the pull request to
(re-)generate the summary at any time.
Generate reviewer's guide: Comment @sourcery-ai guide on the pull
request to (re-)generate the reviewer's guide at any time.
Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
pull request to resolve all Sourcery comments. Useful if you've already
addressed all the comments and don't want to see them anymore.
Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
request to dismiss all existing Sourcery reviews. Especially useful if you
want to start fresh with a new review - don't forget to comment
@sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

Enable or disable review features such as the Sourcery-generated pull request
summary, the reviewer's guide, and others.
Change the review language.
Add, remove or edit custom review instructions.
Adjust other review settings.

Getting Help

Contact our support team for questions or feedback.
Visit our documentation for detailed guides and information.
Keep in touch with the Sourcery team by following us on X/Twitter, LinkedIn or GitHub.

sourcery-ai

Hey there - I've reviewed your changes - here's some feedback:

In the DECIMAL branch, extract the DecimalType cast and its scale into local variables up front to avoid repeated casts and improve readability.
The fallback uses setScale without a RoundingMode, which will throw if the input has more fractional digits than the target scale—consider specifying an explicit RoundingMode or validating precision to make deserialization behavior clearer.

Prompt for AI Agents

Please address the comments from this code review:

## Overall Comments
- In the DECIMAL branch, extract the DecimalType cast and its scale into local variables up front to avoid repeated casts and improve readability.
- The fallback uses setScale without a RoundingMode, which will throw if the input has more fractional digits than the target scale—consider specifying an explicit RoundingMode or validating precision to make deserialization behavior clearer.

## Individual Comments

### Comment 1
<location> `presto-iceberg/src/main/java/com/facebook/presto/iceberg/PartitionData.java:197-187` </location>
<code_context>
+                else if (partitionValue.isBigInteger()) {
+                    return new BigDecimal(partitionValue.bigIntegerValue(), ((DecimalType) type).scale());
+                }
+                else {
+                    return partitionValue.decimalValue().setScale(((DecimalType) type).scale());
+                }
         }
</code_context>

<issue_to_address>
**suggestion (bug_risk):** Consider specifying rounding mode when setting scale on BigDecimal.

Omitting the rounding mode can cause ArithmeticException if the value cannot be exactly represented at the desired scale. Specify a rounding mode to prevent this.

Suggested implementation:

```java
                    return partitionValue.decimalValue().setScale(((DecimalType) type).scale(), java.math.RoundingMode.HALF_UP);

```

If `RoundingMode` is not already imported at the top of the file, add:
```java
import java.math.RoundingMode;
```
You may want to choose a different rounding mode depending on your requirements.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨

_{Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.}

sourcery-ai · 2025-10-06T12:48:58Z

presto-iceberg/src/main/java/com/facebook/presto/iceberg/PartitionData.java

                catch (IOException e) {
                    throw new UncheckedIOException("Failed during JSON conversion of " + partitionValue, e);
                }
            case DECIMAL:


suggestion (bug_risk): Consider specifying rounding mode when setting scale on BigDecimal.

Omitting the rounding mode can cause ArithmeticException if the value cannot be exactly represented at the desired scale. Specify a rounding mode to prevent this.

Suggested implementation:

return partitionValue.decimalValue().setScale(((DecimalType) type).scale(), java.math.RoundingMode.HALF_UP);

If RoundingMode is not already imported at the top of the file, add:

import java.math.RoundingMode;

You may want to choose a different rounding mode depending on your requirements.

PingLiuPing · 2025-10-10T14:41:10Z

@hantangwangd Not sure if you are interested in reviewing this PR, this PR is motivated when reviewing PR #26174 (comment) and I was suggested to move this part and submit a new PR.

PingLiuPing · 2025-10-10T14:43:54Z

@aditi-pandit Can you please take a look. This PR is per your comments in #26174 (comment)

hantangwangd

@PingLiuPing have you ever encountered this problem during real usage? If so, could you also add a similar test case which would fail without this change in IcebergDistributedTestBase or IcebergDistributedSmokeTestBase? Thanks.

PingLiuPing · 2025-10-10T17:43:33Z

@hantangwangd Thanks for the comment.
Yes, these code are needed for Prestissimo iceberg scenario when writing partitioned iceberg table.
When passing partition value (part of the commit message) from velox to Presto coordinator, depends on the actual decimal value we have to cover all the possible conversions.
The logic here is, in Velox decimal is represented as integer (int64/int128). But when we passing the partition value back to Presto, it is encoded in json format, and here when decode the value, based on value it will be treated as different integer types.

PingLiuPing · 2025-10-10T17:45:55Z

@hantangwangd Just want to split this piece of code separately. The purpose of this PR is serving the native iceberg write. I have test cases to cover this from E2E point of view. And I will deliver those testcases once the partition transform related code is merged.
#26182 (comment)

PingLiuPing self-assigned this Oct 6, 2025

PingLiuPing requested review from a team, ZacBlanco and hantangwangd as code owners October 6, 2025 12:47

prestodb-ci added the from:IBM PR from IBM label Oct 6, 2025

prestodb-ci requested review from a team, Mariamalmesfer and namya28 and removed request for a team October 6, 2025 12:48

PingLiuPing requested a review from aditi-pandit October 6, 2025 12:48

sourcery-ai bot reviewed Oct 6, 2025

View reviewed changes

Add more conversion type support for decimal partition value.

8b634d9

PingLiuPing force-pushed the lp_fix_partition_value branch from 34e41e3 to 8b634d9 Compare October 10, 2025 14:38

hantangwangd reviewed Oct 10, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix: Support more type conversion for decimal partition value #26240

fix: Support more type conversion for decimal partition value #26240

PingLiuPing commented Oct 6, 2025 •

edited

Loading

Uh oh!

sourcery-ai bot commented Oct 6, 2025 •

edited

Loading

Interacting with Sourcery

Customizing Your Experience

Getting Help

Uh oh!

sourcery-ai bot left a comment

Uh oh!

sourcery-ai bot Oct 6, 2025

Uh oh!

PingLiuPing commented Oct 10, 2025 •

edited

Loading

Uh oh!

PingLiuPing commented Oct 10, 2025

Uh oh!

hantangwangd left a comment

Uh oh!

PingLiuPing commented Oct 10, 2025

Uh oh!

PingLiuPing commented Oct 10, 2025 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

fix: Support more type conversion for decimal partition value #26240

Are you sure you want to change the base?

fix: Support more type conversion for decimal partition value #26240

Conversation

PingLiuPing commented Oct 6, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Motivation and Context

Impact

Test Plan

Contributor checklist

Release Notes

Uh oh!

sourcery-ai bot commented Oct 6, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviewer's Guide

Flow diagram for DECIMAL partition value deserialization logic

File-Level Changes

Possibly linked issues

Interacting with Sourcery

Customizing Your Experience

Getting Help

Uh oh!

sourcery-ai bot left a comment

Choose a reason for hiding this comment

Uh oh!

sourcery-ai bot Oct 6, 2025

Choose a reason for hiding this comment

Uh oh!

PingLiuPing commented Oct 10, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

PingLiuPing commented Oct 10, 2025

Uh oh!

hantangwangd left a comment

Choose a reason for hiding this comment

Uh oh!

PingLiuPing commented Oct 10, 2025

Uh oh!

PingLiuPing commented Oct 10, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

PingLiuPing commented Oct 6, 2025 •

edited

Loading

sourcery-ai bot commented Oct 6, 2025 •

edited

Loading

PingLiuPing commented Oct 10, 2025 •

edited

Loading

PingLiuPing commented Oct 10, 2025 •

edited

Loading