Open
Description
Currently in dask-deltatable, we're using `pyarrow.dataset.dataset`, which we filter with a `pyarrow.Expression`:
dask-deltatable/dask_deltatable/core.py
Line 78 in dbeb8cc
.to_table(filter=filter_expression, columns=self.columns)
Would the `ParquetDataset` be more appropriate here? It can accept filters as `Expression`, or `tuple`/DNF form, which would allow us to skip that `filters_to_expression` step.
https://arrow.apache.org/docs/python/generated/pyarrow.parquet.ParquetDataset.html