-
Notifications
You must be signed in to change notification settings - Fork 4.2k
Add some race-condition protection to VPA recommender #8320
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: adrianmoisey The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@@ -125,7 +125,7 @@ This document is auto-generated from the flag definitions in the VPA recommender | |||
| `storage` | string | | Specifies storage mode. Supported values: prometheus, checkpoint | | |||
| `target-cpu-percentile` | float | 0.9 | CPU usage percentile that will be used as a base for CPU target recommendation. Doesn't affect CPU lower bound, CPU upper bound nor memory recommendations. | | |||
| `target-memory-percentile` | float | 0.9 | Memory usage percentile that will be used as a base for memory target recommendation. Doesn't affect memory lower bound nor memory upper bound. | | |||
| `update-worker-count` | | 10 | kube-api-qps Number of concurrent workers to update VPA recommendations and checkpoints. When increasing this setting, make sure the client-side rate limits (kube-api-qps and `kube-api-burst`) are either increased or turned off as well. Determines the minimum number of VPA checkpoints written per recommender loop. | | |||
| `update-worker-count` | int | 10 | Number of concurrent workers to update VPA recommendations and checkpoints. When increasing this setting, make sure the client-side rate limits ('kube-api-qps' and 'kube-api-burst') are either increased or turned off as well. Determines the minimum number of VPA checkpoints written per recommender loop. | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I happen to notice this weirdness in the --update-worker-count
due to the back-ticks, so I fixed them while I was here.
Reference: https://pkg.go.dev/flag#PrintDefaults
The listed type, here int, can be changed by placing a back-quoted name in the flag's usage string; the first such item in the message is taken to be a parameter name to show in the message and the back quotes are stripped from the message when displayed
oc.mutex.Lock() | ||
oc.cnt[key]++ | ||
oc.mutex.Unlock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looked at changing this to use atomic.AddInt64()
.
It works, but my concern is that the map need to gaurentee that all keys exist in the map. This is possible by pre-filling the map with all possible combinations. However, we need to remember to update this file each time a new type is added. Ie:
autoscaler/vertical-pod-autoscaler/pkg/utils/metrics/recommender/recommender.go
Lines 41 to 46 in 9bc4220
modes = []string{ | |
string(vpa_types.UpdateModeOff), | |
string(vpa_types.UpdateModeInitial), | |
string(vpa_types.UpdateModeRecreate), | |
string(vpa_types.UpdateModeAuto), | |
} |
The problem is that there doesn't seem to be a good way to dynamically generate that list in the VPA.
What we may be able to do is write a generate script that will keep this file up to date, which we can run in CI.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also looked at sync.Map
. I ran a benchmark of the test I wrote and changing to a sync.Map
doesn't seem to change the overall benchmark's speed much
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
atomic.AddInt64()
should be faster right?
We can open a follow-up issue for this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, it will be faster.
I can make a follow-up issue
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At a glance, the way you have it now seems ok to me!
Also move it to a private field, so that reads/writes are all via the protected methods
From https://pkg.go.dev/flag#PrintDefaults > The listed type, here int, can be changed by placing a back-quoted > name in the flag's usage string; the first such item in the message > is taken to be a parameter name to show in the message and the back > quotes are stripped from the message when displayed.
/unhold I've added a test which seems to catch the race condition without the fixes |
Looks good to me. I would love to hear others thoughts on this |
IsUnderVPA bool | ||
UpdateMode *vpa_types.UpdateMode | ||
ScalingMode *vpa_types.ContainerScalingMode | ||
ControlledResources *[]ResourceName | ||
|
||
mutex sync.RWMutex | ||
} | ||
|
||
// GetLastRecommendation returns last recorded recommendation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could also mention thread safety here.
@@ -464,6 +468,8 @@ func (cluster *clusterState) getContributiveAggregateStateKeys(ctx context.Conte | |||
// keep track of empty recommendations and log information about them | |||
// periodically. | |||
func (cluster *clusterState) RecordRecommendation(vpa *Vpa, now time.Time) error { | |||
cluster.mutex.Lock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we also need read locks throughout this file as well don't we?
To make things easier we could consider using the sync.Map
type for all the problematic maps to just have all that handled for us.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we also need read locks throughout this file as well don't we?
Doesn't seem so, the race condition happens in the map cluster.emptyVPAs
. The is only accessed from this method and DeleteVpa()
.
I'll add a mutex to DeleteVpa()
and leave a TODO to change to a sync.Map
} | ||
close(vpaUpdates) | ||
|
||
wg.Wait() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would it make sense to validate the number of VPA updates processed here?
You could have an atomic counter in here just counting how many updates have been processed.
oc.mutex.Lock() | ||
oc.cnt[key]++ | ||
oc.mutex.Unlock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At a glance, the way you have it now seems ok to me!
oc.cnt[key]++ | ||
oc.mutex.Unlock() | ||
} | ||
|
||
// Observe passes all the computed bucket values to metrics |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to lock down here when reading oc.cnt
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤔 a read lock? I guess so.
@@ -147,7 +159,7 @@ func (a *AggregateContainerState) GetControlledResources() []ResourceName { | |||
// a VPA object. | |||
func (a *AggregateContainerState) MarkNotAutoscaled() { | |||
a.IsUnderVPA = false | |||
a.LastRecommendation = nil | |||
a.lastRecommendation = nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Technically we should also protect this with the mutex as well since we are mutating the value.
And add a read lock when reading
cc @raywainman |
/lgtm |
What type of PR is this?
/kind bug
What this PR does / why we need it:
#7992 introduces goroutines in order to process a batch of VPAs quicker. This inadvertently lead to some race conditions that we're picked up in our tests.
This PR attempts to fix those race conditions
Which issue(s) this PR fixes:
Fixes #8318
Special notes for your reviewer:
Opening this PR early, I still plan to write some tests. I'd like to make a breaking-test that these fixes solve.
/hold
Does this PR introduce a user-facing change?