Description
Hello, I have found that the current autoscaling mechanism does not take into account whether a repo is registered to the runner group specified by an org pool.
We currently have many repos that are not registered under the runner group specified for our pools. When any of those repos has a queued workflow job, it triggers autoscaling, because the repo is in the same org and the tags match.
Unfortunately, most of our existing workflows use the self-hosted tag, so they trigger autoscaling of pools that should not scale (since the repos aren't in the runner group, the jobs won't be picked up). This causes all of our pools to constantly pin the runner count to max and recreate runners non-stop (they are then cleaned up by the scaleDown loop).
As a temporary workaround I have created a fork of garm and have implemented the following:
- Added an in-memory runner group map, with a "cache" list of selected repos, inside the organization object.
type organization struct {
    ...
    rgs map[string]*common.GithubRunnerGroup
}
type GithubRunnerGroup struct {
    ID                   *int64
    Name                 *string
    Visibility           *string
    SelectedRepositories *github.ListRepositories
}
- On org pool manager startup, schedule a job that periodically (every 10 minutes) refreshes the selected repos list for every enabled pool.
  - Loops over all pools and generates a common.GithubRunnerGroup object with the runner group and selected repos info from ghcli.
- On HandleWorkflowJob, verify that the repo is in the runner group's selected repos list.
  - Loops over potential pools (reduced by tags)
    - On public or private visibility, return true
    - On selected visibility, check the repo against the cached selected repos list
- Add an API interface to run the selected repos list update on a specific list of pools.
  - We will trigger this programmatically when a repo is registered to a runner group
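To make the check in HandleWorkflowJob more concrete, here is a minimal sketch of the idea, not the actual code from my fork. The names runnerGroup, runnerGroupCache, Set and RepoAllowed are hypothetical, the cache stores plain repo full names instead of *github.ListRepositories, and treating an unknown runner group as "allowed" is an assumption made so that behaviour stays the same as today until the first refresh completes.

```go
package main

import (
	"fmt"
	"sync"
)

// runnerGroup is a hypothetical, simplified stand-in for the
// GithubRunnerGroup struct shown earlier.
type runnerGroup struct {
	Visibility string              // "selected" restricts the group to Repos
	Repos      map[string]struct{} // selected repos, keyed by "owner/name"
}

// runnerGroupCache is an in-memory cache keyed by runner group name.
type runnerGroupCache struct {
	mu     sync.RWMutex
	groups map[string]*runnerGroup
}

func newRunnerGroupCache() *runnerGroupCache {
	return &runnerGroupCache{groups: make(map[string]*runnerGroup)}
}

// Set replaces the cached entry for a group. The periodic refresh job
// (or the on-demand API call) would call this with data fetched from GitHub.
func (c *runnerGroupCache) Set(name, visibility string, repos []string) {
	set := make(map[string]struct{}, len(repos))
	for _, r := range repos {
		set[r] = struct{}{}
	}
	c.mu.Lock()
	defer c.mu.Unlock()
	c.groups[name] = &runnerGroup{Visibility: visibility, Repos: set}
}

// RepoAllowed reports whether a job from repo can be picked up by
// runners in the given group, based on the cached data only.
func (c *runnerGroupCache) RepoAllowed(group, repo string) bool {
	c.mu.RLock()
	defer c.mu.RUnlock()
	rg, ok := c.groups[group]
	if !ok {
		// Assumption: group not cached yet, so keep the current behaviour.
		return true
	}
	if rg.Visibility != "selected" {
		// Any non-"selected" visibility is treated as unrestricted here.
		return true
	}
	_, found := rg.Repos[repo]
	return found
}

func main() {
	cache := newRunnerGroupCache()
	cache.Set("gpu-runners", "selected", []string{"my-org/repo-a"})

	fmt.Println(cache.RepoAllowed("gpu-runners", "my-org/repo-a"))
	fmt.Println(cache.RepoAllowed("gpu-runners", "my-org/repo-b"))
}
```

With something like this in place, HandleWorkflowJob could skip any tag-matched pool whose runner group returns false for the job's repo, so those jobs would no longer count toward scaling that pool.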