Semantic mismatch in SpecCluster.requested #9103

@alisterburt

Related to #9102.

SpecCluster.requested doesn't match what AdaptiveCore expects:

  • AdaptiveCore expects: "workers we've asked for but haven't arrived yet"
  • SpecCluster provides: "all workers in our spec, expanded by groups"

This mismatch exists because SpecCluster uses self.workers as a proxy for "requested". For non-grouped workers the two are 1:1. For grouped workers, a single entry in self.workers stands for a group of several worker processes, so requested expands it into one name per process.
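To make the 1:1 versus one-to-many distinction concrete, here is a minimal standalone sketch of the group expansion. It does not import distributed; `expand_spec` and the example spec (including the `"cls"` values) are hypothetical names used only for illustration, and the `str(name) + suffix` naming is taken from the snippet in this issue.

```python
def expand_spec(worker_spec):
    """Expand a spec mapping into individual worker names.

    Grouped entries become one name per suffix (str(name) + suffix);
    non-grouped entries keep their own name.
    """
    names = set()
    for name, spec in worker_spec.items():
        if "group" in spec:
            names.update(str(name) + suffix for suffix in spec["group"])
        else:
            names.add(name)
    return names


# Hypothetical spec: one plain worker and one grouped entry with two processes.
worker_spec = {
    0: {"cls": "Worker"},
    1: {"cls": "MultiWorker", "group": ["-0", "-1"]},
}

expanded = expand_spec(worker_spec)
# Two entries in the spec, but three individual worker names after expansion.
print(len(worker_spec), sorted(map(str, expanded)))
```

The spec has two keys, while the expanded set has three names, which is the gap between "entries in self.workers" and "worker processes".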

Potential fix

We could make SpecCluster.requested more accurately represent "workers we've asked for that the scheduler knows about":

@property
def requested(self):
    out = set()
    # Names of the workers the scheduler currently reports
    scheduler_workers = {
        d["name"] for d in self.scheduler_info.get("workers", {}).values()
    }

    for name in self.workers:
        try:
            spec = self.worker_spec[name]
        except KeyError:
            # Worker is no longer part of the spec
            continue

        if "group" in spec:
            # Only count group members that actually exist
            out.update(
                str(name) + suffix
                for suffix in spec["group"]
                if str(name) + suffix in scheduler_workers
            )
        elif name in scheduler_workers:
            out.add(name)
    return out
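The same logic can be tried outside a cluster. The sketch below is an assumption-laden standalone version: `requested_known_to_scheduler` is a hypothetical free function, and the plain dicts stand in for `self.workers`, `self.worker_spec`, and `self.scheduler_info`. It has not been checked against how AdaptiveCore consumes `requested`.

```python
def requested_known_to_scheduler(workers, worker_spec, scheduler_info):
    """Return requested worker names that the scheduler also reports.

    All three arguments are plain mappings standing in for the
    SpecCluster attributes of the same names (an assumption for this sketch).
    """
    scheduler_workers = {
        d["name"] for d in scheduler_info.get("workers", {}).values()
    }
    out = set()
    for name in workers:
        spec = worker_spec.get(name)
        if spec is None:
            # Worker is no longer part of the spec
            continue
        if "group" in spec:
            # Keep only the group members the scheduler knows about
            out.update(
                str(name) + suffix
                for suffix in spec["group"]
                if str(name) + suffix in scheduler_workers
            )
        elif name in scheduler_workers:
            out.add(name)
    return out


# Hypothetical state: worker 2 has left the spec, and only one member
# of group 1 has registered with the scheduler so far.
workers = {0: object(), 1: object(), 2: object()}
worker_spec = {0: {}, 1: {"group": ["-0", "-1"]}}
scheduler_info = {
    "workers": {
        "tcp://10.0.0.1:1234": {"name": 0},
        "tcp://10.0.0.2:1234": {"name": "1-0"},
    }
}

result = requested_known_to_scheduler(workers, worker_spec, scheduler_info)
print(sorted(map(str, result)))
```

With this input, the group member "1-1" is left out because the scheduler has not reported it, and worker 2 is skipped because it has no spec entry.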
