Open
Labels: bug (Something is broken)
Description
Describe the issue:
Persisting a DataFrame, repartitioning it by partition_size, and then reading npartitions, in that order, causes Dask to hang indefinitely. Removing the persist() call avoids the hang.
Minimal Complete Verifiable Example:
import pandas as pd
import dask.dataframe as dd
from dask.distributed import Client
client = Client()
data = {'col1': [1, 2, 3], 'col2': ['A', 'B', 'C']}
df = pd.DataFrame(data)
ddf = dd.from_pandas(df, npartitions=1)
ddf = ddf.persist() # Remove this persist, and the code works fine.
ddf = ddf.repartition(partition_size='250MB')
print(ddf.npartitions)
print(ddf.compute())
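Because the example above never returns, it can be convenient to run it under a watchdog so that a hang shows up as a clear failure instead of a stuck session. The following is only a minimal sketch using the standard library: the helper name finishes_within and the timeout values are hypothetical and not part of Dask. It also assumes that killing the parent interpreter is acceptable; worker processes started by Client() may need separate cleanup.

```python
import subprocess
import sys


def finishes_within(source: str, timeout_s: float) -> bool:
    """Run `source` in a fresh Python interpreter.

    Returns True if the interpreter exits before `timeout_s` seconds,
    and False if it had to be killed because it ran too long.
    (Hypothetical helper for illustration; not part of Dask.)
    """
    try:
        # subprocess.run kills the child and raises TimeoutExpired
        # when the timeout elapses.
        subprocess.run(
            [sys.executable, "-c", source],
            timeout=timeout_s,
            check=False,
        )
    except subprocess.TimeoutExpired:
        return False
    return True
```

Passing the example above as a string, for instance finishes_within(repro_source, 120.0), would then be expected to return False on an affected setup and True once the persist() line is removed, assuming the timeout is generous enough for cluster startup.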
Anything else we need to know?:
Environment:
- Dask version: 2025.7.0
- Python version: Reproduced on 3.10 and 3.11.
- Operating System: Reproduced on Colab and on a local machine (macOS)
- Install method (conda, pip, source): On Colab, using the preinstalled dependencies. Also reproduced locally with dependencies installed using pip.