Runner stuck in "Running" state after job failure, jobs are stuck in Queued #4203

@strowk

Description

Checks

Controller Version

0.12.0

Deployment Method

Helm

Checks

  • This isn't a question or user support case (For Q&A and community support, go to Discussions).
  • I've read the Changelog before submitting this issue and I'm sure it's not due to any recently-introduced backward-incompatible changes

To Reproduce

1. Run the new runners for a while
2. Get one of the jobs to fail (possibly many times until the issue reproduces)
3. See that the ephemeral runner is stuck in Running and notice that other jobs are stuck in Queued

Describe the bug

A GitHub job failed, after which the runner pod was removed (as expected), but the ephemeral runner custom resource got stuck in Running, and now other jobs do not seem to be able to start.

I noticed that something was off because one of our jobs had been queued for several hours. I suspect this is related to one of the ephemeral runners showing Running while no pods are present (it has been in this state for several hours), though I am not entirely sure how the two are connected. I'd assume that if one runner cannot run the job, another one should be created, but the controller does not seem to do this, even though we had spare capacity (it is configured to create no more than 4 runner pods, and there was only one stuck runner).

Note: I also saw #4148, but I am not sure whether this is the same issue, because that issue seems to have a different trigger (a node drain), which I don't believe happened in the case I observed, where the trigger was a job failure.
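For anyone trying to confirm the same symptom, here is a minimal sketch of the check described above: find ephemeral runners whose status says Running but that have no pod of the same name. It assumes the input is the parsed `items` array from `kubectl get ephemeralrunners -o json` plus the pod names from the same namespace, and that the pod shares the runner's name (as the "Created ephemeral runner pod" log entry suggests). The function name and the sample data are made up for illustration.

```python
from typing import Any, Dict, Iterable, List


def find_podless_running_runners(
    runners: Iterable[Dict[str, Any]], pod_names: Iterable[str]
) -> List[str]:
    """Return names of runners that report phase Running but have no matching pod.

    `runners` is assumed to be the `items` list of an EphemeralRunner listing;
    `pod_names` is assumed to hold the names of pods in the same namespace.
    """
    existing = set(pod_names)
    stuck = []
    for item in runners:
        name = item["metadata"]["name"]
        # A runner with no status yet is treated as not Running.
        phase = item.get("status", {}).get("phase", "")
        if phase == "Running" and name not in existing:
            stuck.append(name)
    return stuck


if __name__ == "__main__":
    # Hypothetical sample data, not taken from a real cluster.
    sample_runners = [
        {"metadata": {"name": "runner-a"}, "status": {"phase": "Running"}},
        {"metadata": {"name": "runner-b"}, "status": {"phase": "Running"}},
    ]
    sample_pods = ["runner-b"]
    print(find_podless_running_runners(sample_runners, sample_pods))
```

A runner flagged here is only a candidate; a pod that is about to be recreated would also show up briefly, so the result is worth re-checking after a minute or two.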

Describe the expected behavior

The ephemeral runner should always be removed after the job is done, or else be reused by other jobs. Jobs should not get stuck in Queued for several hours when there is processing capacity.

Additional Context

I tried deleting the ephemeral runner that got into this stuck state, after which a new runner was created immediately and a pod was scheduled as well, which then started running one of the jobs that had been stuck for about an hour.
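The workaround above amounts to deleting the stuck custom resource by hand. A small sketch of how that command could be assembled is below. It assumes `kubectl` is pointed at the right cluster and that the resource is addressable as `ephemeralrunners.actions.github.com`; the helper name is hypothetical, and the runner name and namespace are the ones from the logs in this issue.

```python
import shlex
from typing import List


def delete_runner_command(name: str, namespace: str) -> List[str]:
    """Build the kubectl argv that deletes one ephemeral runner resource."""
    return [
        "kubectl",
        "delete",
        # Assumed resource name for the EphemeralRunner CRD.
        "ephemeralrunners.actions.github.com",
        name,
        "--namespace",
        namespace,
    ]


if __name__ == "__main__":
    cmd = delete_runner_command(
        "linux-arm64-selfhosted-g6pbp-runner-krkdv",
        "github-arc-private-runner",
    )
    # Print instead of executing, so the command can be reviewed first.
    print(shlex.join(cmd))
```

The sketch only prints the command; running it is left as a deliberate manual step, since deleting a runner that is actually executing a job would cancel that job.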

Controller Logs

Here are the logs that relate to the runner that got stuck (newest first).
Most logs are from the controller, but two are from the listener; those are set apart by blank lines.


2025-08-11T09:26:24Z	INFO	EphemeralRunner	Backing off the next reconciliation due to failure	{"version": "0.12.0", "ephemeralrunner": {"name":"linux-arm64-selfhosted-g6pbp-runner-krkdv","namespace":"github-arc-private-runner"}, "lastFailure": "2025-08-11 09:26:22 +0000 UTC", "nextReconciliation": "2025-08-11T09:26:27Z", "requeueAfter": "2.256639026s"}
2025-08-11T09:26:24Z	INFO	EphemeralRunner	Backing off the next reconciliation due to failure	{"version": "0.12.0", "ephemeralrunner": {"name":"linux-arm64-selfhosted-g6pbp-runner-krkdv","namespace":"github-arc-private-runner"}, "lastFailure": "2025-08-11 09:26:22 +0000 UTC", "nextReconciliation": "2025-08-11T09:26:27Z", "requeueAfter": "2.264400303s"}
2025-08-11T09:26:23Z	INFO	EphemeralRunner	Backing off the next reconciliation due to failure	{"version": "0.12.0", "ephemeralrunner": {"name":"linux-arm64-selfhosted-g6pbp-runner-krkdv","namespace":"github-arc-private-runner"}, "lastFailure": "2025-08-11 09:26:22 +0000 UTC", "nextReconciliation": "2025-08-11T09:26:27Z", "requeueAfter": "3.241068051s"}
2025-08-11T09:26:23Z	INFO	EphemeralRunner	Backing off the next reconciliation due to failure	{"version": "0.12.0", "ephemeralrunner": {"name":"linux-arm64-selfhosted-g6pbp-runner-krkdv","namespace":"github-arc-private-runner"}, "lastFailure": "2025-08-11 09:26:22 +0000 UTC", "nextReconciliation": "2025-08-11T09:26:27Z", "requeueAfter": "3.265817726s"}

2025-08-11T09:26:23Z	INFO	listener-app.listener	Job completed message received.	{"RequestId": 0, "Result": "failed", "RunnerId": 10634, "RunnerName": "linux-arm64-selfhosted-g6pbp-runner-krkdv"}

2025-08-11T09:26:22Z	INFO	EphemeralRunner	Backing off the next reconciliation due to failure	{"version": "0.12.0", "ephemeralrunner": {"name":"linux-arm64-selfhosted-g6pbp-runner-krkdv","namespace":"github-arc-private-runner"}, "lastFailure": "2025-08-11 09:26:22 +0000 UTC", "nextReconciliation": "2025-08-11T09:26:27Z", "requeueAfter": "4.124344142s"}
2025-08-11T09:26:22Z	INFO	EphemeralRunner	Backing off the next reconciliation due to failure	{"version": "0.12.0", "ephemeralrunner": {"name":"linux-arm64-selfhosted-g6pbp-runner-krkdv","namespace":"github-arc-private-runner"}, "lastFailure": "2025-08-11 09:26:22 +0000 UTC", "nextReconciliation": "2025-08-11T09:26:27Z", "requeueAfter": "4.126298274s"}
2025-08-11T09:26:22Z	INFO	EphemeralRunner	EphemeralRunner pod is deleted and status is updated with failure count	{"version": "0.12.0", "ephemeralrunner": {"name":"linux-arm64-selfhosted-g6pbp-runner-krkdv","namespace":"github-arc-private-runner"}}
2025-08-11T09:26:22Z	INFO	EphemeralRunner	Updating ephemeral runner status to track the failure count	{"version": "0.12.0", "ephemeralrunner": {"name":"linux-arm64-selfhosted-g6pbp-runner-krkdv","namespace":"github-arc-private-runner"}}
2025-08-11T09:26:22Z	INFO	EphemeralRunner	Deleting the ephemeral runner pod	{"version": "0.12.0", "ephemeralrunner": {"name":"linux-arm64-selfhosted-g6pbp-runner-krkdv","namespace":"github-arc-private-runner"}, "podId": "75653cad-e428-4e8b-bdcb-e6fe3f2d618a"}
2025-08-11T09:26:22Z	INFO	EphemeralRunner	Ephemeral runner pod has finished, but the runner still exists in the service. Deleting the pod to restart it.	{"version": "0.12.0", "ephemeralrunner": {"name":"linux-arm64-selfhosted-g6pbp-runner-krkdv","namespace":"github-arc-private-runner"}}
2025-08-11T09:26:22Z	INFO	EphemeralRunner	Runner exists in GitHub service	{"version": "0.12.0", "ephemeralrunner": {"name":"linux-arm64-selfhosted-g6pbp-runner-krkdv","namespace":"github-arc-private-runner"}, "runnerId": 10634}
2025-08-11T09:26:21Z	INFO	EphemeralRunner	Checking if runner exists in GitHub service	{"version": "0.12.0", "ephemeralrunner": {"name":"linux-arm64-selfhosted-g6pbp-runner-krkdv","namespace":"github-arc-private-runner"}, "runnerId": 10634}

2025-08-11T09:26:13Z	INFO	listener-app.worker.kubernetesworker	Updating job info for the runner	{"runnerName": "linux-arm64-selfhosted-g6pbp-runner-krkdv", "ownerName": "redacted", "repoName": "redacted-redacted-service", "workflowRef": "redacted/redacted-redacted-service/.github/workflows/build.yaml@refs/heads/main", "workflowRunId": 16876024243, "jobDisplayName": "build", "requestId": 0}

2025-08-11T09:26:13Z	INFO	EphemeralRunner	Ephemeral runner container is still running	{"version": "0.12.0", "ephemeralrunner": {"name":"linux-arm64-selfhosted-g6pbp-runner-krkdv","namespace":"github-arc-private-runner"}}
2025-08-11T09:25:43Z	INFO	EphemeralRunner	Ephemeral runner container is still running	{"version": "0.12.0", "ephemeralrunner": {"name":"linux-arm64-selfhosted-g6pbp-runner-krkdv","namespace":"github-arc-private-runner"}}
2025-08-11T09:25:43Z	INFO	EphemeralRunner	Updated ephemeral runner status	{"version": "0.12.0", "ephemeralrunner": {"name":"linux-arm64-selfhosted-g6pbp-runner-krkdv","namespace":"github-arc-private-runner"}}
2025-08-11T09:25:43Z	INFO	EphemeralRunner	Updating ephemeral runner status	{"version": "0.12.0", "ephemeralrunner": {"name":"linux-arm64-selfhosted-g6pbp-runner-krkdv","namespace":"github-arc-private-runner"}, "statusPhase": "Running", "statusReason": "", "statusMessage": "", "ready": true}
2025-08-11T09:25:43Z	INFO	EphemeralRunner	Ephemeral runner container is still running	{"version": "0.12.0", "ephemeralrunner": {"name":"linux-arm64-selfhosted-g6pbp-runner-krkdv","namespace":"github-arc-private-runner"}}
2025-08-11T09:25:42Z	INFO	EphemeralRunner	Ephemeral runner container is still running	{"version": "0.12.0", "ephemeralrunner": {"name":"linux-arm64-selfhosted-g6pbp-runner-krkdv","namespace":"github-arc-private-runner"}}
2025-08-11T09:25:42Z	INFO	EphemeralRunner	Ephemeral runner container is still running	{"version": "0.12.0", "ephemeralrunner": {"name":"linux-arm64-selfhosted-g6pbp-runner-krkdv","namespace":"github-arc-private-runner"}}
2025-08-11T09:25:38Z	INFO	EphemeralRunner	Ephemeral runner container is still running	{"version": "0.12.0", "ephemeralrunner": {"name":"linux-arm64-selfhosted-g6pbp-runner-krkdv","namespace":"github-arc-private-runner"}}
2025-08-11T09:25:31Z	INFO	EphemeralRunner	Ephemeral runner container is still running	{"version": "0.12.0", "ephemeralrunner": {"name":"linux-arm64-selfhosted-g6pbp-runner-krkdv","namespace":"github-arc-private-runner"}}
2025-08-11T09:25:25Z	INFO	EphemeralRunner	Ephemeral runner container is still running	{"version": "0.12.0", "ephemeralrunner": {"name":"linux-arm64-selfhosted-g6pbp-runner-krkdv","namespace":"github-arc-private-runner"}}
2025-08-11T09:24:30Z	INFO	EphemeralRunner	Ephemeral runner container is still running	{"version": "0.12.0", "ephemeralrunner": {"name":"linux-arm64-selfhosted-g6pbp-runner-krkdv","namespace":"github-arc-private-runner"}}
2025-08-11T09:24:30Z	INFO	EphemeralRunner	Updated ephemeral runner status	{"version": "0.12.0", "ephemeralrunner": {"name":"linux-arm64-selfhosted-g6pbp-runner-krkdv","namespace":"github-arc-private-runner"}}
2025-08-11T09:24:30Z	INFO	EphemeralRunner	Updating ephemeral runner status	{"version": "0.12.0", "ephemeralrunner": {"name":"linux-arm64-selfhosted-g6pbp-runner-krkdv","namespace":"github-arc-private-runner"}, "statusPhase": "Pending", "statusReason": "", "statusMessage": "", "ready": false}
2025-08-11T09:24:30Z	INFO	EphemeralRunner	Ephemeral runner container is still running	{"version": "0.12.0", "ephemeralrunner": {"name":"linux-arm64-selfhosted-g6pbp-runner-krkdv","namespace":"github-arc-private-runner"}}
2025-08-11T09:24:30Z	INFO	EphemeralRunner	Waiting for runner container status to be available	{"version": "0.12.0", "ephemeralrunner": {"name":"linux-arm64-selfhosted-g6pbp-runner-krkdv","namespace":"github-arc-private-runner"}}
2025-08-11T09:23:43Z	INFO	EphemeralRunner	Waiting for runner container status to be available	{"version": "0.12.0", "ephemeralrunner": {"name":"linux-arm64-selfhosted-g6pbp-runner-krkdv","namespace":"github-arc-private-runner"}}
2025-08-11T09:23:43Z	INFO	EphemeralRunner	Waiting for runner container status to be available	{"version": "0.12.0", "ephemeralrunner": {"name":"linux-arm64-selfhosted-g6pbp-runner-krkdv","namespace":"github-arc-private-runner"}}
2025-08-11T09:23:43Z	INFO	EphemeralRunner	Created ephemeral runner pod	{"version": "0.12.0", "ephemeralrunner": {"name":"linux-arm64-selfhosted-g6pbp-runner-krkdv","namespace":"github-arc-private-runner"}, "runnerScaleSetId": 2, "runnerName": "linux-arm64-selfhosted-g6pbp-runner-krkdv", "runnerId": 10634, "configUrl": "https://github.com/redacted", "podName": "linux-arm64-selfhosted-g6pbp-runner-krkdv"}
2025-08-11T09:23:43Z	INFO	EphemeralRunner	Created new pod spec for ephemeral runner	{"version": "0.12.0", "ephemeralrunner": {"name":"linux-arm64-selfhosted-g6pbp-runner-krkdv","namespace":"github-arc-private-runner"}}
2025-08-11T09:23:43Z	INFO	EphemeralRunner	Creating new pod for ephemeral runner	{"version": "0.12.0", "ephemeralrunner": {"name":"linux-arm64-selfhosted-g6pbp-runner-krkdv","namespace":"github-arc-private-runner"}}
2025-08-11T09:23:43Z	INFO	EphemeralRunner	Creating new EphemeralRunner pod.	{"version": "0.12.0", "ephemeralrunner": {"name":"linux-arm64-selfhosted-g6pbp-runner-krkdv","namespace":"github-arc-private-runner"}}
2025-08-11T09:23:43Z	INFO	EphemeralRunner	Created ephemeral runner secret	{"version": "0.12.0", "ephemeralrunner": {"name":"linux-arm64-selfhosted-g6pbp-runner-krkdv","namespace":"github-arc-private-runner"}, "secretName": "linux-arm64-selfhosted-g6pbp-runner-krkdv"}
2025-08-11T09:23:43Z	INFO	EphemeralRunner	Created new secret spec for ephemeral runner	{"version": "0.12.0", "ephemeralrunner": {"name":"linux-arm64-selfhosted-g6pbp-runner-krkdv","namespace":"github-arc-private-runner"}}
2025-08-11T09:23:43Z	INFO	EphemeralRunner	Creating new secret for ephemeral runner	{"version": "0.12.0", "ephemeralrunner": {"name":"linux-arm64-selfhosted-g6pbp-runner-krkdv","namespace":"github-arc-private-runner"}}
2025-08-11T09:23:43Z	INFO	EphemeralRunner	Creating new ephemeral runner secret for jitconfig.	{"version": "0.12.0", "ephemeralrunner": {"name":"linux-arm64-selfhosted-g6pbp-runner-krkdv","namespace":"github-arc-private-runner"}}
2025-08-11T09:23:43Z	INFO	EphemeralRunner	Updated ephemeral runner status with runnerId and runnerJITConfig	{"version": "0.12.0", "ephemeralrunner": {"name":"linux-arm64-selfhosted-g6pbp-runner-krkdv","namespace":"github-arc-private-runner"}}
2025-08-11T09:23:43Z	INFO	EphemeralRunner	Updating ephemeral runner status with runnerId and runnerJITConfig	{"version": "0.12.0", "ephemeralrunner": {"name":"linux-arm64-selfhosted-g6pbp-runner-krkdv","namespace":"github-arc-private-runner"}}
2025-08-11T09:23:43Z	INFO	EphemeralRunner	Created ephemeral runner JIT config	{"version": "0.12.0", "ephemeralrunner": {"name":"linux-arm64-selfhosted-g6pbp-runner-krkdv","namespace":"github-arc-private-runner"}, "runnerId": 10634}
2025-08-11T09:23:42Z	INFO	EphemeralRunner	Creating ephemeral runner JIT config	{"version": "0.12.0", "ephemeralrunner": {"name":"linux-arm64-selfhosted-g6pbp-runner-krkdv","namespace":"github-arc-private-runner"}}
2025-08-11T09:23:42Z	INFO	EphemeralRunner	Creating new ephemeral runner registration and updating status with runner config	{"version": "0.12.0", "ephemeralrunner": {"name":"linux-arm64-selfhosted-g6pbp-runner-krkdv","namespace":"github-arc-private-runner"}}
2025-08-11T09:23:42Z	INFO	EphemeralRunner	Successfully added runner registration finalizer	{"version": "0.12.0", "ephemeralrunner": {"name":"linux-arm64-selfhosted-g6pbp-runner-krkdv","namespace":"github-arc-private-runner"}}
2025-08-11T09:23:42Z	INFO	EphemeralRunner	Adding runner registration finalizer	{"version": "0.12.0", "ephemeralrunner": {"name":"linux-arm64-selfhosted-g6pbp-runner-krkdv","namespace":"github-arc-private-runner"}}
2025-08-11T09:23:42Z	INFO	EphemeralRunner	Successfully added finalizer	{"version": "0.12.0", "ephemeralrunner": {"name":"linux-arm64-selfhosted-g6pbp-runner-krkdv","namespace":"github-arc-private-runner"}}
2025-08-11T09:23:42Z	INFO	EphemeralRunnerSet	Created new ephemeral runner	{"version": "0.12.0", "ephemeralrunnerset": {"name":"linux-arm64-selfhosted-g6pbp","namespace":"github-arc-private-runner"}, "runner": "linux-arm64-selfhosted-g6pbp-runner-krkdv"}
2025-08-11T09:23:42Z	INFO	EphemeralRunner	Adding finalizer	{"version": "0.12.0", "ephemeralrunner": {"name":"linux-arm64-selfhosted-g6pbp-runner-krkdv","namespace":"github-arc-private-runner"}}
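To make the backoff pattern in the logs above easier to inspect, here is a rough sketch that pulls the `requeueAfter` values out of the "Backing off the next reconciliation" entries for one runner. It assumes the five fields on each line are tab-separated, as in the excerpt, and that the last field is a JSON object; the function names are made up for illustration. Only durations written as plain seconds (for example `2.256639026s`) are handled; other duration formats are skipped.

```python
import json
from typing import Any, Dict, Iterable, List, Optional, Tuple


def parse_line(line: str) -> Optional[Dict[str, Any]]:
    """Split one log line into timestamp, level, logger, message and JSON fields."""
    parts = line.rstrip("\n").split("\t", 4)
    if len(parts) != 5:
        return None  # blank separator lines and anything unexpected
    ts, level, logger, message, raw = parts
    try:
        fields = json.loads(raw)
    except json.JSONDecodeError:
        return None
    return {"ts": ts, "level": level, "logger": logger,
            "message": message, "fields": fields}


def backoff_delays(lines: Iterable[str], runner_name: str) -> List[Tuple[str, float]]:
    """Return (timestamp, requeueAfter in seconds) for one runner's backoff entries."""
    delays = []
    for line in lines:
        entry = parse_line(line)
        if entry is None or not entry["message"].startswith("Backing off"):
            continue
        if entry["fields"].get("ephemeralrunner", {}).get("name") != runner_name:
            continue
        value = str(entry["fields"].get("requeueAfter", ""))
        if not value.endswith("s"):
            continue
        try:
            # Strip the trailing unit; values such as "500ms" fail here and are skipped.
            seconds = float(value[:-1])
        except ValueError:
            continue
        delays.append((entry["ts"], seconds))
    return delays
```

Feeding it the controller lines above (for example via `open("controller.log")`, a hypothetical file name) would list each backoff entry with its delay, which makes it easier to see whether reconciliation kept being scheduled after the last logged entry.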

Runner Pod Logs

These were sadly not captured in k8s

Labels

bug (Something isn't working), gha-runner-scale-set (Related to the gha-runner-scale-set mode)
