KEP-5237: watch-based route controller reconciliation using informers #5289

Open · wants to merge 13 commits into master

Conversation

lukasmetzner

  • One-line PR description: Introduce a feature gate to enable informer-based reconciliation in the routes' controller of cloud-controller-manager, reducing API calls and improving efficiency.
  • Other comments:

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory labels May 8, 2025
@k8s-ci-robot
Contributor

Welcome @lukasmetzner!

It looks like this is your first PR to kubernetes/enhancements 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/enhancements has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added the sig/cloud-provider Categorizes an issue or PR as relevant to SIG Cloud Provider. label May 8, 2025
@k8s-ci-robot
Contributor

Hi @lukasmetzner. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels May 8, 2025
@lukasmetzner
Author

/cc @elmiko @JoelSpeed

@k8s-ci-robot k8s-ci-robot requested review from elmiko and JoelSpeed May 8, 2025 08:10
@apricote
Member

apricote commented May 8, 2025

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels May 8, 2025
Contributor

@elmiko elmiko left a comment


the core concepts make sense to me, i think we should clean up the language around the term "cloud provider". in some places we use it to mean the controllers (ie the ccm), and other places we use it to mean the infrastructure provider (eg aws, azure, gcp), and we also refer to the framework as well.

we also need to make some decisions about the open questions. i wonder if we should go over these questions at the next sig meeting?

@lukasmetzner
Author

@elmiko If possible, I’d love to work through the open questions here in the PR so we can keep things moving, and then have a discussion in the next SIG meeting. Otherwise, we might end up losing a couple of weeks. What do you think?

Open Questions:

  • What should be the default frequency for the periodic full reconciliation?
    • Input from @JoelSpeed: We should do it similar to other controllers and choose a random time between 12h and 24h.
    • As we use the same shared informer factory as other controllers, this should already be implemented.
  • Are there other Node fields besides node.status.addresses and PodCIDRs that should trigger a route update?
  • How should we set the interval for the periodic reconcile? Options:
    • Adjust --route-reconcile-period when feature gate enabled
    • Use --min-resync-period; currently defaults to 12h
    • Introduce a new flag
    • If we use the 12h-24h option we can probably reuse --min-resync-period.

@elmiko
Contributor

elmiko commented May 9, 2025

i'm fine to continue the discussions here.

Input from @JoelSpeed: We should do it similar to other controllers and choose a random time between 12h and 24h.

i think 12h sounds fine to me.

Are there other Node fields besides node.status.addresses and PodCIDRs that should trigger a route update?

i'll have to think about this a little more, i have a feeling that those are good to start with.

How should we set the interval for the periodic reconcile?

i like the idea of adjusting the default for the --route-reconcile-period, but i don't want users to get confused about this.

my only issue with using --min-resync-period is that it sounds much more general and we are just focusing on the route controller.

@lukasmetzner
Author

i like the idea of adjusting the default for the --route-reconcile-period, but i don't want users to get confused about this.
my only issue with using --min-resync-period is that it sounds much more general and we are just focusing on the route controller

To my understanding, both the service and node controllers are already watch-based and should utilize the --min-resync-period flag. Adopting this approach would bring consistency across the CCM components. In this context, the --route-reconcile-period flag could be considered for deprecation.

@elmiko
Contributor

elmiko commented May 27, 2025

ack, thank you @lukasmetzner that makes sense to me. it seems we should focus on --min-resync-period then.

@lukasmetzner lukasmetzner requested a review from elmiko June 2, 2025 05:57
@elmiko
Contributor

elmiko commented Jun 10, 2025

@lukasmetzner do we need another round of reviews here?

@lukasmetzner
Author

@elmiko From my point of view, I am happy with the content, and we could get this merged or move forward.

This is my first contribution, have I missed adding some labels? 😅

@elmiko
Contributor

elmiko commented Jun 10, 2025

This is my first contribution, have I missed adding some labels? 😅

no, i don't think so. just wanted to make sure i didn't miss anything. i will give another review before end of week.

@lukasmetzner
Author

This is my first contribution, have I missed adding some labels? 😅

no, i don't think so. just wanted to make sure i didn't miss anything. i will give another review before end of week.

Sounds good. With the PRR freeze tomorrow, I guess we won't make it into the v1.34 milestone. I will bump the milestone targets then.

@elmiko
Contributor

elmiko commented Jun 11, 2025

Sounds good. With the PRR freeze tomorrow, I guess we won't make it into the v1.34 milestone. I will bump the milestone targets then.

ahh, apologies. i missed that the freeze is today. sorry for the inconvenience and delay.

Contributor

@elmiko elmiko left a comment


found a few places to clean things up, also i'd like to ask @JoelSpeed to take a review as well.

@lukasmetzner lukasmetzner force-pushed the 5237-watch-based-route-controller-reconciliation branch from 1833f04 to fb308d4 Compare June 12, 2025 14:18
@elmiko
Contributor

elmiko commented Jun 13, 2025

thanks for the updates @lukasmetzner , i probably won't get a chance to review today but i will review again early next week.

Contributor

@elmiko elmiko left a comment


i think this is looking good, i'm happy to label it. i would also like @JoelSpeed to take another review.

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 18, 2025

We currently use the `Node.Status.Addresses` and `PodCIDRs` fields to trigger updates in the route reconciliation mechanism. However, relying solely on these fields may be insufficient, potentially causing missed route reconciliations when updates are necessary. Whether they suffice depends on the specific cloud-controller-manager implementation: these fields work for the CCM maintained by the authors, but we do not know the details of other providers.

This is mitigated by the feature gate `CloudControllerManagerWatchBasedRoutesReconciliation`, which allows infrastructure providers to test it and provide feedback on the fields.


Do we have metrics that can be used to A/B test the count of reconciles before/after this change?

Author


As far as I know, the current CCM implementation doesn't expose metrics to measure this directly. I’d expect cloud providers might be able to observe the impact through their own API request logs.

In our WIP implementation, we’ve introduced a new named workqueue, which does emit some metrics that could be useful for analysis. However, these metrics will only be available when the new feature is enabled.

This is mitigated by:

- Any node event that is not filtered out causing a full reconcile of *all* routes
- The controller still doing a periodic reconcile to clean up outdated routes even if no events come in, just at a lower frequency than before


If we see an event for the Node, but we cannot find the Node object, is there a way to clear up the associated routes/cause a full resync?

Is the issue described here associated with every Node removal, or only some?

Author


The node informer periodically (every 12 to 24 hours) emits update events for all nodes in the cluster. If the resource version has not changed, we still enqueue a reconcile call to trigger regular cleanup of stale routes.

The issue described here is associated with every Node removal.

#### Alpha

- Implementation merged in [`k8s.io/cloud-provider`](http://k8s.io/cloud-provider)
- At least one cloud-provider has tested this in their CCM


Are we expecting any major changes to be needed by each provider to integrate the new version of the route controller?

Author


There’s no change to the cloud-provider interface — we’re still calling the same reconcile() method in the routes controller. The only difference is when it gets invoked.

Apart from the new feature gate, I don't expect anything to change for the providers.

…nciliation/README.md

Co-authored-by: Joel Speed <Joel.speed@hotmail.co.uk>
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 25, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: lukasmetzner
Once this PR has been reviewed and has the lgtm label, please ask for approval from elmiko. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Contributor

@elmiko elmiko left a comment


just checking back here, i'm still lgtm on this.

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 30, 2025
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 2, 2025
@lukasmetzner
Author

@elmiko I forgot to update the milestones to now target v1.35 for the alpha.

Copy link
Contributor

@elmiko elmiko left a comment


thanks @lukasmetzner !

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 7, 2025
@lukasmetzner
Author

@JoelSpeed Just wondering if you’ve had a chance to review the PR again, or if you might be able to soon?

lukasmetzner added a commit to hetznercloud/hcloud-cloud-controller-manager that referenced this pull request Jul 11, 2025
Switch to manually created dependencies for `cloud-provider` and
`controller-manager`. These patches contain a feature gate, which
enables watch-based reconciliation in the routes' controller of
cloud-controller-manager, reducing API calls. Previously the routes
were reconciled every ten seconds.

This feature is for an upcoming preview release, which is used to test
our current upstream contribution at Kubernetes. Two forks of
`cloud-provider` and `controller-manager` are used. The tag `v0.33.2` is
used as a base, with the necessary patches applied manually.

This feature will be reverted upon completing our tests.

Upstream WIP PR: kubernetes/kubernetes#131220
Kubernetes Enhancement Proposal:
kubernetes/enhancements#5289
Fork of `cloud-provider`: https://github.com/lukasmetzner/cloud-provider
Fork of `controller-manager`:
https://github.com/lukasmetzner/controller-manager

---------

Co-authored-by: Julian Tölle <julian.toelle@hetzner-cloud.de>
lukasmetzner pushed a commit to hetznercloud/hcloud-cloud-controller-manager that referenced this pull request Jul 11, 2025
This release introduces an experimental feature to address #395.

### Watch-Based Route Reconciliation (Experimental)

Currently, route reconciliation is performed at a fixed interval of 30
seconds. This leads to unnecessary API requests, as a `GET
/v1/networks/{id}` call is triggered every 30 seconds, even when no
changes have occurred.

Upstream we have proposed an event-driven approach, similar to the
mechanism used by other controllers such as the Load Balancer
Controller. With this new approach, route reconciliation is triggered on
node additions, node deletions, or when the `PodCIDRs` or `Addresses` of
nodes change. Additionally, to ensure consistency, reconciliation will
still occur periodically at a randomized interval between 12 and 24
hours.

We are close to merging a [Kubernetes Enhancement Proposal
(KEP)](kubernetes/enhancements#5289).
Furthermore, a [work-in-progress pull
request](kubernetes/kubernetes#131220)
containing the implementation is already open in the Kubernetes
repository.

#### Forked Upstream Libraries

In this release, we replaced the upstream `controller-manager` and
`cloud-provider` libraries with our own forks. These forks are based on
the upstream `v0.33.2` release (aligned with Kubernetes v1.33.2) and
include our patches on top.

#### Enabling the Feature

This feature is **disabled by default** and will not impact existing
deployments unless explicitly enabled. We **do not recommend** running
this feature in production environments at this stage. However, we
welcome early testers who can try it in non-critical setups. Running
with this feature active is enough for us to analyze its impact. No
additional feedback is required.

To enable the feature, set the following Helm value:


`args.feature-gates=CloudControllerManagerWatchBasedRoutesReconciliation=true`

### Features

- watch-based route reconciliation (#970)