[EPIC] Complete DML Support (MERGE, INSERT OVERWRITE, TRUNCATE) #19617

@ethan-tyler

Is your feature request related to a problem or challenge?

With #19142 merged, DataFusion has TableProvider hooks for DELETE and UPDATE. This epic tracks the remaining DML statements needed for a complete SQL surface.

Currently missing:

  • MERGE INTO - downstream projects have to decompose this themselves
  • INSERT OVERWRITE - important for partition replacement workflows
  • TRUNCATE TABLE - currently requires DELETE WHERE true

Describe the solution you'd like

MERGE statement:

  • sqlparser-rs already parses MERGE (Statement::Merge)
  • Add LogicalPlan::Merge node
  • Physical planning via decomposition to existing DELETE/UPDATE/INSERT hooks
  • TableProvider::merge_into() for storage-native implementations
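
To make the decomposition idea concrete, here is a minimal, self-contained sketch. It does not use DataFusion types: `SourceRow`, `Decomposed`, and `decompose_merge` are hypothetical names, and the MERGE it models is assumed to have three clauses (WHEN MATCHED AND s.deleted THEN DELETE, WHEN MATCHED THEN UPDATE, WHEN NOT MATCHED AND NOT s.deleted THEN INSERT). A real planner would work on plans and record batches rather than in-memory rows.

```rust
use std::collections::HashSet;

/// Hypothetical source row: join key, new value, and a tombstone flag.
#[derive(Debug, Clone, PartialEq)]
pub struct SourceRow {
    pub id: i64,
    pub value: i64,
    pub deleted: bool,
}

/// The three primitive operations the MERGE is split into.
#[derive(Debug, Default, PartialEq)]
pub struct Decomposed {
    pub delete_ids: Vec<i64>,     // would go to the DELETE hook
    pub updates: Vec<(i64, i64)>, // would go to the UPDATE hook
    pub inserts: Vec<(i64, i64)>, // would go to the INSERT hook
}

/// Classify each source row against the set of keys present in the target.
pub fn decompose_merge(target_ids: &HashSet<i64>, source: &[SourceRow]) -> Decomposed {
    let mut out = Decomposed::default();
    for s in source {
        let matched = target_ids.contains(&s.id);
        match (matched, s.deleted) {
            (true, true) => out.delete_ids.push(s.id),
            (true, false) => out.updates.push((s.id, s.value)),
            (false, false) => out.inserts.push((s.id, s.value)),
            // Not matched and tombstoned: no clause applies, so the row is skipped.
            (false, true) => {}
        }
    }
    out
}
```

Each of the three output sets maps onto one existing hook, which is why a default decomposition could work for any provider that already supports DELETE, UPDATE, and INSERT. A storage-native `merge_into()` would skip this split and apply the changes in one pass.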

INSERT OVERWRITE:

  • Extend insert_into() or add separate hook
  • Support static and dynamic partition overwrite modes
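
The difference between the two overwrite modes can be sketched with a toy in-memory table. This assumes one common reading of the terms: static overwrite replaces the partitions named up front in the statement, while dynamic overwrite replaces only the partitions that actually appear in the incoming rows. `Table`, `OverwriteMode`, and `insert_overwrite` are hypothetical names, not a proposed API.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical table: partition value -> rows stored in that partition.
pub type Table = BTreeMap<String, Vec<i64>>;

pub enum OverwriteMode {
    /// Partitions to replace are named in the statement.
    Static(Vec<String>),
    /// Partitions to replace are discovered from the incoming rows.
    Dynamic,
}

pub fn insert_overwrite(table: &mut Table, incoming: &[(String, i64)], mode: &OverwriteMode) {
    // Work out which partitions are dropped before any new row is written.
    let to_clear: BTreeSet<String> = match mode {
        OverwriteMode::Static(named) => named.iter().cloned().collect(),
        OverwriteMode::Dynamic => incoming.iter().map(|(p, _)| p.clone()).collect(),
    };
    for p in &to_clear {
        table.remove(p);
    }
    // Append the incoming rows to their (now empty or new) partitions.
    for (p, v) in incoming {
        table.entry(p.clone()).or_insert_with(Vec::new).push(*v);
    }
}
```

Under this model, a static overwrite naming partitions `a` and `b` empties `b` even when no incoming row targets it, whereas a dynamic overwrite with the same input leaves `b` alone. That difference is what a hook signature would need to carry, whether as an extra argument to `insert_into()` or as a separate method.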

TRUNCATE:

  • Add TableProvider::truncate() hook
  • More efficient than full table scan + delete
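
One possible shape for the hook, sketched with a toy trait rather than the real `TableProvider`: `truncate()` gets a default implementation that reports "not supported", and the planner falls back to a delete-everything scan when a table has not overridden it. `DmlTable`, `ScanOnlyTable`, `NativeTable`, and `run_truncate` are all hypothetical, and the `rows_scanned` counter exists only to make the cost difference visible.

```rust
/// Hypothetical stand-in for a table provider with DML hooks.
pub trait DmlTable {
    /// Delete rows matching `pred`; returns the number of rows removed.
    fn delete_where(&mut self, pred: fn(i64) -> bool) -> usize;

    /// Optional fast path. The default signals that the table has none.
    fn truncate(&mut self) -> Result<usize, String> {
        Err("TRUNCATE not supported by this table".to_string())
    }
}

/// A table that only knows how to delete by scanning.
pub struct ScanOnlyTable {
    pub rows: Vec<i64>,
    pub rows_scanned: usize,
}

impl DmlTable for ScanOnlyTable {
    fn delete_where(&mut self, pred: fn(i64) -> bool) -> usize {
        let before = self.rows.len();
        self.rows_scanned += before; // every row is inspected
        self.rows.retain(|r| !pred(*r));
        before - self.rows.len()
    }
}

/// A table that can drop its contents without reading them.
pub struct NativeTable {
    pub rows: Vec<i64>,
    pub rows_scanned: usize,
}

impl DmlTable for NativeTable {
    fn delete_where(&mut self, pred: fn(i64) -> bool) -> usize {
        let before = self.rows.len();
        self.rows_scanned += before;
        self.rows.retain(|r| !pred(*r));
        before - self.rows.len()
    }

    fn truncate(&mut self) -> Result<usize, String> {
        let n = self.rows.len();
        self.rows.clear(); // no scan needed
        Ok(n)
    }
}

/// Planner-side helper: prefer the hook, fall back to DELETE WHERE true.
pub fn run_truncate<T: DmlTable>(table: &mut T) -> usize {
    match table.truncate() {
        Ok(n) => n,
        Err(_) => table.delete_where(|_| true),
    }
}
```

With this layout, existing providers keep working unchanged through the fallback, and providers that can drop files or metadata directly opt in by overriding one method.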

ON CONFLICT (nice-to-have):

  • PostgreSQL-style INSERT ... ON CONFLICT DO UPDATE/NOTHING
  • Lower priority; rounds out the DML surface
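
For reference, the two conflict actions behave roughly like the following sketch over a keyed in-memory map. This is a toy model under the assumption of a single unique key column; `OnConflict` and `insert_on_conflict` are hypothetical names.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

/// What to do when the inserted key already exists.
pub enum OnConflict {
    DoNothing,
    DoUpdate,
}

/// Insert one row; returns true if the table was modified.
pub fn insert_on_conflict(
    table: &mut HashMap<i64, String>,
    key: i64,
    value: &str,
    action: &OnConflict,
) -> bool {
    match table.entry(key) {
        // No conflict: behaves like a plain INSERT.
        Entry::Vacant(slot) => {
            slot.insert(value.to_string());
            true
        }
        // Conflict: either keep the existing row or overwrite it.
        Entry::Occupied(mut slot) => match action {
            OnConflict::DoNothing => false,
            OnConflict::DoUpdate => {
                slot.insert(value.to_string());
                true
            }
        },
    }
}
```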

Describe alternatives you've considered

Downstream projects can continue implementing MERGE decomposition themselves. This works, but it duplicates planning logic across integrations and pushes SQL semantics outside the query engine.

Additional context

Related issues for context:

Happy to take this on if there's interest!

Labels

enhancement (New feature or request)