
Dask hangs indefinitely when repartitioning after persist #9101

@kjleftin

Describe the issue:

Calling persist(), then repartition(partition_size=...), and then reading npartitions, in that order, causes Dask to hang indefinitely when a distributed Client is running.

Minimal Complete Verifiable Example:

import pandas as pd
import dask.dataframe as dd
from dask.distributed import Client

client = Client()

data = {'col1': [1, 2, 3], 'col2': ['A', 'B', 'C']}
df = pd.DataFrame(data)
ddf = dd.from_pandas(df, npartitions=1)

ddf = ddf.persist()  # Without this persist() call, the code runs fine.
ddf = ddf.repartition(partition_size='250MB')
print(ddf.npartitions)
print(ddf.compute())
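
Because the failure is a hang rather than an exception, it can help to put a time limit on each step of the reproduction to see which call blocks. Below is a minimal sketch that uses only the standard library. The helper name run_with_timeout and the timeout values are hypothetical and not part of Dask. The helper does not stop the stuck call, it only reports that the call did not return in time.

```python
import threading


def run_with_timeout(fn, timeout_s):
    """Run fn() in a daemon thread and raise TimeoutError if it does not finish.

    Hypothetical diagnostic helper (not a Dask API). The worker thread is not
    killed on timeout; it is a daemon thread, so it will not keep the
    interpreter alive at exit.
    """
    result = {}

    def target():
        try:
            result["value"] = fn()
        except BaseException as exc:  # pass any failure back to the caller
            result["error"] = exc

    worker = threading.Thread(target=target, daemon=True)
    worker.start()
    worker.join(timeout_s)

    if worker.is_alive():
        raise TimeoutError(f"call did not finish within {timeout_s} s")
    if "error" in result:
        raise result["error"]
    return result["value"]
```

For instance, wrapping the individual steps of the example above, such as run_with_timeout(lambda: ddf.npartitions, 30) and run_with_timeout(ddf.compute, 30), would show which of them is the one that never returns.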

Anything else we need to know?:

Environment:

  • Dask version: 2025.7.0
  • Python version: Reproduced on 3.10 and 3.11.
  • Operating System: Reproduced on Colab and on a local machine (macOS).
  • Install method (conda, pip, source): On Colab, using the preinstalled dependencies. Also reproduced locally with dependencies installed via pip.

Labels: bug (Something is broken)
