chore: extract comparison into separate tool #2632
Conversation
```diff
 case (a: Array[_], b: Array[_]) =>
   a.length == b.length && a.zip(b).forall(x => same(x._1, x._2))
-case (a: WrappedArray[_], b: WrappedArray[_]) =>
+case (a: mutable.WrappedArray[_], b: mutable.WrappedArray[_]) =>
```
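The array branch quoted above recurses into each pair of elements. A minimal Python sketch of the same idea, for illustration only (the helper name `same` follows the Scala snippet; the rest is an assumption, not the PR's code):

```python
def same(a, b):
    # Mirror the Array/WrappedArray cases: when both sides are list-like,
    # require equal length and element-wise equality, recursing so that
    # nested arrays are also compared structurally.
    if isinstance(a, (list, tuple)) and isinstance(b, (list, tuple)):
        return len(a) == len(b) and all(same(x, y) for x, y in zip(a, b))
    # Fall back to plain equality for scalar values.
    return a == b
```

The recursion is what makes nested array columns compare structurally rather than by reference.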
Moved it from #2614.
Codecov Report ✅ All modified and coverable lines are covered by tests.

```
@@            Coverage Diff             @@
##             main    #2632      +/-   ##
============================================
+ Coverage   56.12%   59.21%      +3.08%
- Complexity    976     1449        +473
============================================
  Files         119      147         +28
  Lines       11743    13755       +2012
  Branches     2251     2365        +114
============================================
+ Hits         6591     8145       +1554
- Misses       4012     4387        +375
- Partials     1140     1223         +83
```

View full report in Codecov by Sentry.
I don't think that we should have a combined fuzz-testing-and-tpc-benchmark tool. They serve quite different purposes. I think it would be better to move the DataFrame comparison logic into a shared class somewhere and then update our benchmarking tool to be able to use it. This probably means that we need to convert our benchmark script from Python to Scala. |
Another option would be to update the existing Python benchmark script to save query results to Parquet, and then implement a command-line tool for comparing the Parquet files produced from the Spark and Comet runs. |
I created #2640 to add a new option to the benchmark script, to write query results to Parquet. |
Right, this option looks better IMO, since we can have a command-line utility similar to the fuzzer and reuse the comparison logic. We still need this PR in some form, as it includes refactoring that makes the comparison reusable.
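The flow being discussed, comparing query results that the Spark and Comet runs wrote out separately, could be sketched roughly as follows. This is a hedged sketch operating on already-collected rows (the Parquet reading is left out so it stays self-contained); `compare_results` and its tolerance parameter are illustrative names, not the PR's actual API:

```python
def compare_results(spark_rows, comet_rows, epsilon=1e-9):
    """Compare two query result sets row by row.

    Rows are sorted first because Parquet output carries no ordering
    guarantee; floats are compared with a small tolerance since Comet
    and Spark may differ in the last bits of floating-point results.
    """
    if len(spark_rows) != len(comet_rows):
        return False
    left_sorted = sorted(map(tuple, spark_rows))
    right_sorted = sorted(map(tuple, comet_rows))
    for left, right in zip(left_sorted, right_sorted):
        if len(left) != len(right):
            return False
        for a, b in zip(left, right):
            if isinstance(a, float) and isinstance(b, float):
                if abs(a - b) > epsilon:
                    return False
            elif a != b:
                return False
    return True
```

Sorting before comparing is the key design choice here: without an ORDER BY, the two engines are free to emit rows in different orders.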
Force-pushed from 9cee835 to f381c3d.
```diff
 output_path = f"{write_path}/q{query}"
 df.coalesce(1).write.mode("overwrite").parquet(output_path)
 print(f"Query {query} results written to {output_path}")
+if len(df.columns) > 0:
```
Spark complains when saving a DataFrame with an empty schema; this can happen for DDL statements, which came up in the TPC query sets.
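The guard described in the comment above amounts to a simple pre-write check. A plain-Python stand-in (the helper name is hypothetical; in the real script the check is inlined on the DataFrame's columns):

```python
def should_write(columns):
    # DDL statements (e.g. CREATE VIEW) produce a result with an empty
    # schema; Spark raises an error when such a DataFrame is written to
    # Parquet, so the write is skipped in that case.
    return len(columns) > 0
```

For a regular query result with columns the write proceeds; for a DDL statement's empty schema it is skipped.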
```scala
  verify()
}

object ComparisonToolMain {
```
Could this just be named ComparisonTool?
```scala
// Read Comet parquet files
val cometDf = spark.read.parquet(cometSubfolderPath.getAbsolutePath)
val cometRows = cometDf.collect()
val cometPlan = cometDf.queryExecution.executedPlan.toString
```
I'm not sure why we need to do anything with the plans for reading the Parquet files. Shouldn't we just be comparing the data in the Parquet files?
True, the comparison has nothing to do with the plans. The plans only need to be displayed when an assertion fails down the road. Let me think about whether I can get rid of it.
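In other words, the plans are diagnostics attached to the failure message, not inputs to the comparison. A minimal sketch of that separation (the function and its signature are illustrative, not the tool's actual code):

```python
def assert_same(spark_rows, comet_rows, spark_plan, comet_plan):
    # The plans play no part in the comparison itself; they are attached
    # to the error message only when the assertion fails, so the user can
    # see which physical plans produced the mismatching results.
    if spark_rows != comet_rows:
        raise AssertionError(
            f"Results differ.\nSpark plan:\n{spark_plan}\nComet plan:\n{comet_plan}"
        )
```

On success nothing is printed; only on mismatch do the two plans appear in the raised error.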
LGTM. Thanks @comphead
Which issue does this PR close?
Related: #2614, #2611.
Rationale for this change
Extract the comparison into a separate tool that can run against already-generated Comet and Spark results. Added schema comparison and fixed minor bugs in the runner.
What changes are included in this PR?
How are these changes tested?