Only compute bounds/dynamic filters if a consumer asks for them #17527

@LiaCastaneda

Describe the bug

This is a follow-up to this comment.

Currently, DataFusion computes bounds for every query that contains a HashJoinExec node whenever the option enable_dynamic_filter_pushdown is set to true (the default). It would make sense to compute these bounds only when we know there is a consumer that will use them.

One way to achieve this could be during physical planning: while traversing the plan, check whether any scan/leaf node is interested in, or supports, dynamic filters (as determined by gather_filters_for_pushdown). This might only require adding some logic to the filter pushdown optimization rule itself.

Then, only if at least one interested consumer accepts the DynamicFilterPhysicalExpr, set a flag on HashJoinExec to build the bounds accumulator; otherwise, skip bounds computation entirely.
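
To make the idea concrete, here is a rough sketch of the traversal on a toy plan model. Everything in it is hypothetical: `PlanNode`, `accepts_dynamic_filters`, `compute_bounds`, `has_interested_consumer`, and `mark_joins` are illustrative names, not DataFusion APIs, and the sketch assumes the dynamic filter produced by a join is consumed on its probe side. In DataFusion itself, the "is anyone interested" answer would presumably come from the existing filter pushdown machinery rather than a boolean on the scan.

```rust
// Toy model only: these types are hypothetical stand-ins, not DataFusion's
// ExecutionPlan, HashJoinExec, or filter pushdown APIs.

#[derive(Debug)]
enum PlanNode {
    /// Leaf node; `accepts_dynamic_filters` models a scan that would accept
    /// a dynamic filter pushed down to it.
    Scan { accepts_dynamic_filters: bool },
    /// Hash join; `compute_bounds` is the hypothetical flag from the proposal.
    HashJoin {
        build: Box<PlanNode>,
        probe: Box<PlanNode>,
        compute_bounds: bool,
    },
    /// Any other operator that simply forwards filters to its children.
    Passthrough { children: Vec<PlanNode> },
}

/// Returns true if any leaf under `node` would accept a dynamic filter.
fn has_interested_consumer(node: &PlanNode) -> bool {
    match node {
        PlanNode::Scan { accepts_dynamic_filters } => *accepts_dynamic_filters,
        PlanNode::HashJoin { build, probe, .. } => {
            has_interested_consumer(build) || has_interested_consumer(probe)
        }
        PlanNode::Passthrough { children } => {
            children.iter().any(|child| has_interested_consumer(child))
        }
    }
}

/// Walks the plan and enables bounds computation on a join only when its
/// probe side (assumed here to be where the filter is consumed) contains
/// at least one interested consumer. Otherwise the flag is cleared.
fn mark_joins(node: &mut PlanNode) {
    match node {
        PlanNode::Scan { .. } => {}
        PlanNode::HashJoin { build, probe, compute_bounds } => {
            // Handle nested joins first.
            mark_joins(&mut **build);
            mark_joins(&mut **probe);
            *compute_bounds = has_interested_consumer(&**probe);
        }
        PlanNode::Passthrough { children } => {
            for child in children.iter_mut() {
                mark_joins(child);
            }
        }
    }
}
```

With this model, a join whose probe side contains a scan with `accepts_dynamic_filters: true` ends up with `compute_bounds` set, while a join whose probe side has no such scan has the flag cleared and would skip building the bounds accumulator.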

To Reproduce

No response

Expected behavior

No response

Additional context

No response

Labels

bug (Something isn't working)
