Bug in Coscheduling plugin #930

### Area

- [x] Scheduler
- [ ] Controller
- [ ] Helm Chart
- [ ] Documents

### Other components

_No response_

### What happened?

The Coscheduling plugin has a code path in which a podgroup is permitted even though only one pod in the group can be scheduled, with no other changes to the cluster and no other incoming pods.
That one pod is then assigned a node, while the second pod is caught in the post filter of the Coscheduling plugin and rejected. We end up with a partially scheduled podgroup, which causes fragmentation.
This looks like a race condition, since it does not happen every time.

During debugging I saw that the first pod of the podgroup entered the `waiting` state via the `permit` function of the Coscheduling plugin, as it should, because one node is free after all. But when the second pod in the group comes through, it hits the `success` case of the `permit` function, and only in the next scheduling loop does it turn out that there was no second free node. By that point the first pod has already been admitted.

### What did you expect to happen?

We would expect the incoming podgroup from the reproduction steps below to stay pending until the single pod is removed.

### How can we reproduce it (as minimally and precisely as possible)?

1. Have exactly two nodes on which a given workload can be scheduled (for instance, restricted via taints).
2. Create a single pod on one of these nodes using the `default-scheduler`, requesting all resources on that node.
3. Create a podgroup of 2 pods with the `scheduler-plugins-scheduler` profile and the same resource requests as the single pod. We would expect this group to stay pending, because the initial pod already takes up one node.
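
The steps above could be expressed roughly as the manifests below. This is only a sketch, not the exact setup used when the bug was observed: the object names, the `example.com/dedicated` taint key and node label, the image, and the CPU request are all placeholders and need to be adapted so that a single pod fills one node completely.

```yaml
# Step 2: one pod placed by the default scheduler that fills a whole node.
apiVersion: v1
kind: Pod
metadata:
  name: blocker                       # hypothetical name
spec:
  schedulerName: default-scheduler
  nodeSelector:
    example.com/dedicated: "true"     # hypothetical label on the two nodes
  tolerations:
  - key: example.com/dedicated        # hypothetical taint on the two nodes
    operator: Exists
    effect: NoSchedule
  containers:
  - name: main
    image: registry.k8s.io/pause:3.9
    resources:
      requests:
        cpu: "4"                      # placeholder: set to the node's allocatable CPU
---
# Step 3: a podgroup that requires both pods to be schedulable together.
apiVersion: scheduling.x-k8s.io/v1alpha1
kind: PodGroup
metadata:
  name: pg-repro                      # hypothetical name
spec:
  minMember: 2
---
# Two member pods, created via a Deployment for brevity.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pg-repro-members              # hypothetical name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: pg-repro-members
  template:
    metadata:
      labels:
        app: pg-repro-members
        scheduling.x-k8s.io/pod-group: pg-repro   # ties the pods to the PodGroup
    spec:
      schedulerName: scheduler-plugins-scheduler
      nodeSelector:
        example.com/dedicated: "true"
      tolerations:
      - key: example.com/dedicated
        operator: Exists
        effect: NoSchedule
      containers:
      - name: main
        image: registry.k8s.io/pause:3.9
        resources:
          requests:
            cpu: "4"                  # same placeholder request as the blocker pod
```

With one node occupied by `blocker`, both member pods should remain `Pending`. The bug shows up when one of them is bound to the remaining node while the other stays unschedulable.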

### Anything else we need to know?

_No response_

### Kubernetes version

<details>

```console
$ kubectl version
Client Version: v1.32.3
Kustomize Version: v5.5.0
Server Version: v1.32.7
```

</details>


### Scheduler Plugins version

Commit 2fd0b94 (current master)

