Skip to content

Conversation

@andygrove
Copy link
Member

@andygrove andygrove commented Oct 25, 2025

Which issue does this PR close?

Closes #2626

Rationale for this change

Fall back to Spark by default when sorting on floating-point values. Users can opt in if they don't care about the different handling of +0.0 vs -0.0.

What changes are included in this PR?

  • Refactor serde for SortOrder
  • Add specific tests
  • Write documentation (compatibility and tuning guide)

How are these changes tested?

@codecov-commenter
Copy link

codecov-commenter commented Oct 25, 2025

Codecov Report

❌ Patch coverage is 82.75862% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 59.20%. Comparing base (f09f8af) to head (2ce637d).
⚠️ Report is 640 commits behind head on main.

Files with missing lines Patch % Lines
.../main/scala/org/apache/comet/serde/CometSort.scala 81.48% 1 Missing and 4 partials ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main    #2650      +/-   ##
============================================
+ Coverage     56.12%   59.20%   +3.08%     
- Complexity      976     1449     +473     
============================================
  Files           119      147      +28     
  Lines         11743    13755    +2012     
  Branches       2251     2365     +114     
============================================
+ Hits           6591     8144    +1553     
- Misses         4012     4387     +375     
- Partials       1140     1224      +84     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@andygrove andygrove changed the title fix: Make floating-point sort compatible with Spark when handling negative zero fix: Make floating-point sort compatible with Spark when handling negative zero [WIP] Oct 25, 2025
@andygrove andygrove changed the title fix: Make floating-point sort compatible with Spark when handling negative zero [WIP] fix: Mark SortOrder with floating-point as incompatible [WIP] Oct 25, 2025
@andygrove andygrove changed the title fix: Mark SortOrder with floating-point as incompatible [WIP] fix: Mark SortOrder with floating-point as incompatible Oct 26, 2025
@andygrove andygrove marked this pull request as ready for review October 26, 2025 17:50
withSQLConf(CometConf.getExprAllowIncompatConfigKey("SortOrder") -> "false") {
checkSparkAnswerAndFallbackReason(
"select * from tbl order by 1, 2",
"unsupported range partitioning sort order")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I filed a follow-up issue for improving the fallback reason to show the root cause

#2651


override def getSupportLevel(expr: SortOrder): SupportLevel = {

def containsFloatingPoint(dt: DataType): Boolean = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def containsFloatingPoint(dt: DataType): Boolean = {
def dataTypeContainsFloatingPoint(dt: DataType): Boolean = {

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The method takes a DataType parameter, so it seems redundant to include that in the method name as well?

Copy link
Contributor

@comphead comphead left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @andygrove it is lgtm, just a small suggestion on naming

Copy link
Contributor

@mbutrovich mbutrovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll probably continue to grumble about what "incompatible" means but this LGTM. Thanks @andygrove!

@mbutrovich mbutrovich merged commit 0b33b05 into apache:main Oct 28, 2025
102 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Sorting on floating point with negative zero incompatible with Spark

4 participants