Skip to content

Conversation

cgivre
Copy link
Contributor

@cgivre cgivre commented Oct 1, 2025

DRILL-8537: Bump Calcite to Version 1.35

Description

I am attempting a new approach and instead of bumping Calcite from 1.34 -> 1.40, I'm going to try this one version at a time and see how far we get.

Significant Changes in Calcite 1.35

  • Calcite 1.35 adds a literal_agg function which allows literals in aggregate queries.
  • Fix bugs in the ElasticSearch plugin which prevented certain configuration parameters from being passed to ElasticSearch.
  • Fix bugs introduced in Calcite 1.34 relating to certain date arithmetic functions
  • Update issues in Drill relating to Calcite's new VARDECIMAL handling.

Documentation

No user facing changes.

Testing

Ran existing unit tests

@cgivre cgivre self-assigned this Oct 1, 2025
@cgivre cgivre marked this pull request as draft October 1, 2025 13:45
@cgivre cgivre added code-cleanup dependencies calcite-update Changes required for updating the Calcite version backport-to-stable This bug fix is applicable to the latest stable release and should be considered for inclusion there labels Oct 1, 2025
@cgivre
Copy link
Contributor Author

cgivre commented Oct 6, 2025

Major Changes

1. Function Type Inference

EXTRACT Function

Problem: EXTRACT(SECOND) was returning BIGINT instead of DOUBLE, losing fractional seconds
Solution:

  • Created DrillCalciteSqlExtractWrapper with custom type inference
  • Updated DrillConvertletTable.extractConvertlet() to use TypeInferenceUtils.getSqlTypeNameForTimeUnit()
  • Returns DOUBLE for SECOND, BIGINT for other time units

Files Modified:

  • DrillCalciteSqlExtractWrapper.java (new)
  • DrillConvertletTable.java
  • DrillOperatorTable.java
  • TestFunctionsWithTypeExpoQueries.java

TIMESTAMPDIFF Function

Problem: Type mismatch between validation (BIGINT) and conversion (INTEGER)
Solution:

  • Created DrillCalciteSqlTimestampDiffWrapper for consistent BIGINT return type
  • Updated DrillConvertletTable.timestampDiffConvertlet() to use BIGINT
  • Both validation and conversion now consistently return BIGINT

Files Modified:

  • DrillCalciteSqlTimestampDiffWrapper.java (new)
  • DrillConvertletTable.java
  • DrillOperatorTable.java

TIMESTAMPADD Function

Problem: Calcite 1.35 was adding precision to DATE types, causing assertion errors
Solution:

  • Created DrillCalciteSqlTimestampAddWrapper with proper type logic
  • Only adds precision to TIMESTAMP and TIME types, not DATE
  • Updated DrillConvertletTable.timestampAddConvertlet() to skip precision for DATE

Files Modified:

  • DrillCalciteSqlTimestampAddWrapper.java (new)
  • DrillConvertletTable.java
  • DrillOperatorTable.java

2. Function Registration & Resolution

Vararg Functions (CONCAT, COALESCE, etc.)

Problem: Function resolution failures for functions with variable arguments
Solution: Enhanced LocalFunctionRegistry with sophisticated vararg matching logic

Files Modified:

  • LocalFunctionRegistry.java (+57 lines)

Niladic Special Functions (CURRENT_DATE, SESSION_USER, etc.)

Problem: Special functions not properly recognized in Calcite 1.35
Solution:

  • Created SpecialFunctionRewriter to handle niladic function transformations
  • Added explicit registration in DrillOperatorTable

Files Modified:

  • SpecialFunctionRewriter.java (+84 lines, new)
  • CountFunctionRewriter.java (+52 lines, new)
  • CharToVarcharRewriter.java (+61 lines, new)
  • DrillOperatorTable.java

3. COUNT(*) Handling

Problem: COUNT(*) type inference changed in Calcite 1.35
Solution: Created LiteralAggFunction for proper literal aggregate handling

Files Modified:

  • LiteralAggFunction.java (+192 lines, new)
  • DrillAggregateRel.java
  • AggPrelBase.java

4. Aggregate Cost Estimation

Problem: Deprecated RelOptCost constructor removed
Solution: Updated to use withAggCallCount() builder pattern

Files Modified:

  • DrillAggregateRel.java
  • AggPrelBase.java
  • DrillReduceAggregatesRule.java

5. TIMESTAMPADD Implementation

Problem: Complete signature change in Calcite 1.35
Solution:

  • Reimplemented TIMESTAMPADD function with new interval arithmetic
  • Added template-based code generation

Files Modified:

  • DateIntervalFunc.tdd
  • TimestampAddFunction.java (+203 lines, new)
  • DrillConvertletTable.java

6. Complex Writer Functions

Problem: FLATTEN, CONVERT_FROM, CONVERT_TO require ProjectRecordBatch context
Solution:

  • Modified DrillConstExecutor to skip constant folding for these functions
  • Updated dummy function implementations for Calcite 1.35 compatibility

Files Modified:

  • DrillConstExecutor.java
  • DummyConvertFrom.java
  • DummyConvertTo.java
  • DummyFlatten.java

7. FLATTEN in Aggregates Validation

Problem: FLATTEN only validated in COUNT, allowed in other aggregates
Solution: Changed validation from SqlCountAggFunction to SqlAggFunction to catch ALL aggregate types

Files Modified:

  • UnsupportedOperatorsVisitor.java

8. Error Handling & Validation

Prepared Statement Errors

Problem: Parse errors wrapped differently in RPC layer, appearing as SYSTEM instead of VALIDATION
Solution:

  • Added exception unwrapping in SqlConverter.parse()
  • Updated test expectations to match Calcite 1.35 behavior

Files Modified:

  • SqlConverter.java
  • TestPreparedStatementProvider.java

Invalid CAST Operations

Problem: Calcite 1.35 correctly rejects semantically invalid CAST(DATE as TIME)
Solution: Removed invalid test case with explanatory comment

Files Modified:

  • TestParquetFilterPushDownForDateTimeCasts.java

Test Updates

Core Module Tests

  • TestFunctionsQuery.java - Updated for DOUBLE return from EXTRACT(SECOND)
  • TestFunctionsWithTypeExpoQueries.java - Fixed type expectations (FLOAT8 for SECOND)
  • TestTimestampAddDiffFunctions.java - Updated for new type inference
  • TestCountStar.java - Fixed test infrastructure (changed to PlanTestBase)
  • TestParquetFilterPushDownForDateTimeCasts.java - Removed invalid cast test
  • TestAggregateFunctions.java - Updated aggregate type expectations
  • TestLiteralAggFunction.java (+241 lines, new) - Comprehensive literal aggregate testing

JDBC Storage Plugin Tests

  • TestJdbcPluginWithMySQLIT.java - Updated SQRT to return DOUBLE instead of BigDecimal
  • TestJdbcPluginWithMSSQL.java - Fixed schema expectations (BIGINT for COUNT, REQUIRED types)
  • TestJdbcPluginWithPostgres.java - Fixed schema expectations (BIGINT for COUNT, REQUIRED types)

Key Behavioral Changes

Type Inference

  1. EXTRACT(SECOND) now returns DOUBLE (was BIGINT) - supports fractional seconds
  2. TIMESTAMPDIFF returns BIGINT consistently
  3. COUNT(*) returns BIGINT (was INT in some contexts)
  4. SQRT and math functions consistently return DOUBLE
  5. Literal expressions and aggregates are REQUIRED (not OPTIONAL/nullable)

Function Resolution

  1. Vararg functions (CONCAT, COALESCE) now match multiple signatures
  2. Niladic functions (CURRENT_DATE, etc.) properly transformed with parentheses
  3. Special functions (COUNT(*), FLATTEN, etc.) have dedicated handling

Validation

  1. Stricter type checking between validation and conversion phases
  2. Invalid casts (DATE→TIME) now properly rejected
  3. FLATTEN properly rejected in ALL aggregate functions (not just COUNT)

Files Created

Core Engine

  1. DrillCalciteSqlExtractWrapper.java - Custom EXTRACT type inference
  2. DrillCalciteSqlTimestampAddWrapper.java - Custom TIMESTAMPADD type inference
  3. DrillCalciteSqlTimestampDiffWrapper.java - Custom TIMESTAMPDIFF type inference
  4. SpecialFunctionRewriter.java - Niladic function rewriting
  5. CountFunctionRewriter.java - COUNT(*) rewriting
  6. CharToVarcharRewriter.java - CHAR to VARCHAR conversion
  7. LiteralAggFunction.java - Literal aggregate handling
  8. TimestampAddFunction.java - New TIMESTAMPADD implementation

Tests

  1. TestLiteralAggFunction.java - Literal aggregate testing
  2. TestCountStar.java - COUNT(*) functionality testing

Migration Notes for Developers

If you use EXTRACT(SECOND):

  • Before: Returned BIGINT (lost fractional seconds)
  • After: Returns DOUBLE (preserves fractional seconds like 45.123)
  • Action: Update code expecting integer values to handle doubles

If you use COUNT(*):

  • Before: Could return INT in some contexts
  • After: Consistently returns BIGINT
  • Action: Update code expecting INT to handle BIGINT

If you use SQRT or math functions:

  • Before: Might return DECIMAL/BigDecimal in some contexts
  • After: Consistently returns DOUBLE
  • Action: Update type expectations in tests/code

If you cast DATE to TIME:

  • Before: Allowed (but semantically meaningless)
  • After: Properly rejected with validation error
  • Action: Remove invalid casts, use proper conversions

Compatibility

Backward Compatibility

  • Query Results: Mostly compatible, but type changes may affect downstream applications
  • API: Fully compatible
  • Storage Format: Fully compatible

Breaking Changes

  1. EXTRACT(SECOND) return type changed from BIGINT → DOUBLE
  2. COUNT(*) return type changed from INT → BIGINT in some contexts
  3. Math function return types more strictly DOUBLE (not DECIMAL)
  4. Invalid DATE→TIME casts now rejected

@cgivre cgivre changed the title DRILL-XXXX: Bump Calcite to Version 1.35 DRILL-8537: Bump Calcite to Version 1.35 Oct 9, 2025
@cgivre cgivre marked this pull request as ready for review October 9, 2025 02:38
@cgivre cgivre linked an issue Oct 9, 2025 that may be closed by this pull request
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backport-to-stable This bug fix is applicable to the latest stable release and should be considered for inclusion there calcite-update Changes required for updating the Calcite version code-cleanup dependencies

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Elasticsearch storage plugin error on connect to elastic cloud

1 participant