Provision RBAC for Kubeconfigs inside kcp #91
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing
/kind feature

/retest
clusterRoles:
  items:
    type: string
  type: array
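For illustration, a Kubeconfig spec using a field with this schema might look as follows. The surrounding structure and field placement are assumptions for this sketch, not taken from the diff; only the `clusterRoles` shape (an array of strings) comes from the schema above.

```yaml
# hypothetical excerpt of a Kubeconfig object
spec:
  authorization:               # placement is illustrative
    clusterRoles:
      - cluster-admin
      - edit
```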
Does clusterRoles only reference objects that already exist in the cluster? I am wondering if it could be made possible to also configure RBAC inside the cluster?
For now this is only binding to pre-existing ClusterRoles, yes.
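As a sketch of what "binding to pre-existing ClusterRoles" produces inside the target workspace, a generated binding could look roughly like this. The name, prefix, and subject are illustrative assumptions; the PR only states that names are derived from the Kubeconfig's UID and that the referenced ClusterRole must already exist.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  # hypothetical name; the operator derives names from the Kubeconfig UID
  name: kubeconfig-<uid>
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin          # must already exist in the target workspace
subjects:
  - kind: User
    name: my-kubeconfig-user   # illustrative subject
```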
/retest

(4 similar comments)
/merge-method squash
//go:build e2e

/*
Copyright 2025 The KCP Authors.
You mean kcp?
Hrhrhrhrhrhrhrhr got me. However kcp-dev/kcp#3665 says
I did not touch the boilerplate header as I am not sure about the CNCF ramifications, but personally I would of course also change them, if we can.
LGTM label has been added. Git tree hash: 654f43b28b5bb40f956a6f3e363d93b10caa1d77
/*
Copyright 2025 The KCP Authors.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/

package client

import (
	"context"
	"fmt"

	"github.com/kcp-dev/logicalcluster/v3"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/runtime"
	"k8s.io/apimachinery/pkg/types"
	"k8s.io/client-go/rest"
	ctrlruntimeclient "sigs.k8s.io/controller-runtime/pkg/client"

	"github.com/kcp-dev/kcp-operator/internal/resources"
	operatorv1alpha1 "github.com/kcp-dev/kcp-operator/sdk/apis/operator/v1alpha1"
)

// NewRootShardClient returns a client that talks directly to the root shard's
// in-cluster Service, optionally scoped to a logical cluster.
func NewRootShardClient(ctx context.Context, c ctrlruntimeclient.Client, rootShard *operatorv1alpha1.RootShard, cluster logicalcluster.Name, scheme *runtime.Scheme) (ctrlruntimeclient.Client, error) {
	baseURL := fmt.Sprintf("https://%s.%s.svc.cluster.local:6443", resources.GetRootShardServiceName(rootShard), rootShard.Namespace)

	if !cluster.Empty() {
		baseURL = fmt.Sprintf("%s/clusters/%s", baseURL, cluster.String())
	}

	return newClient(ctx, c, baseURL, scheme, rootShard, nil, nil)
}

// NewRootShardProxyClient returns a client that talks to the root shard's
// internal proxy Service, optionally scoped to a logical cluster.
func NewRootShardProxyClient(ctx context.Context, c ctrlruntimeclient.Client, rootShard *operatorv1alpha1.RootShard, cluster logicalcluster.Name, scheme *runtime.Scheme) (ctrlruntimeclient.Client, error) {
	baseURL := fmt.Sprintf("https://%s.%s.svc.cluster.local:6443", resources.GetRootShardProxyServiceName(rootShard), rootShard.Namespace)

	if !cluster.Empty() {
		baseURL = fmt.Sprintf("%s/clusters/%s", baseURL, cluster.String())
	}

	return newClient(ctx, c, baseURL, scheme, rootShard, nil, nil)
}

// NewShardClient returns a client that talks directly to a (non-root) shard's
// in-cluster Service, optionally scoped to a logical cluster.
func NewShardClient(ctx context.Context, c ctrlruntimeclient.Client, shard *operatorv1alpha1.Shard, cluster logicalcluster.Name, scheme *runtime.Scheme) (ctrlruntimeclient.Client, error) {
	baseURL := fmt.Sprintf("https://%s.%s.svc.cluster.local:6443", resources.GetShardServiceName(shard), shard.Namespace)

	if !cluster.Empty() {
		baseURL = fmt.Sprintf("%s/clusters/%s", baseURL, cluster.String())
	}

	return newClient(ctx, c, baseURL, scheme, nil, shard, nil)
}

func newClient(
	ctx context.Context,
	c ctrlruntimeclient.Client,
	url string,
	scheme *runtime.Scheme,
	// exactly one of these three must be provided, the others nil
	rootShard *operatorv1alpha1.RootShard,
	shard *operatorv1alpha1.Shard,
	frontProxy *operatorv1alpha1.FrontProxy,
) (ctrlruntimeclient.Client, error) {
	tlsConfig, err := getTLSConfig(ctx, c, rootShard, shard, frontProxy)
	if err != nil {
		return nil, fmt.Errorf("failed to determine TLS settings: %w", err)
	}

	cfg := &rest.Config{
		Host:            url,
		TLSClientConfig: tlsConfig,
	}

	return ctrlruntimeclient.New(cfg, ctrlruntimeclient.Options{Scheme: scheme})
}

// +kubebuilder:rbac:groups=core,resources=secrets,verbs=get

func getTLSConfig(ctx context.Context, c ctrlruntimeclient.Client, rootShard *operatorv1alpha1.RootShard, shard *operatorv1alpha1.Shard, frontProxy *operatorv1alpha1.FrontProxy) (rest.TLSClientConfig, error) {
	rootShard, err := getRootShard(ctx, c, rootShard, shard, frontProxy)
	if err != nil {
		return rest.TLSClientConfig{}, fmt.Errorf("failed to determine effective RootShard: %w", err)
	}

	// get the Secret containing the kcp-operator client certificate
	key := types.NamespacedName{
		Namespace: rootShard.Namespace,
		Name:      resources.GetRootShardCertificateName(rootShard, operatorv1alpha1.OperatorCertificate),
	}

	certSecret := &corev1.Secret{}
	if err := c.Get(ctx, key, certSecret); err != nil {
		return rest.TLSClientConfig{}, fmt.Errorf("failed to get kcp-operator client certificate Secret: %w", err)
	}

	return rest.TLSClientConfig{
		CAData:   certSecret.Data["ca.crt"],
		CertData: certSecret.Data["tls.crt"],
		KeyData:  certSecret.Data["tls.key"],
	}, nil
}

// +kubebuilder:rbac:groups=operator.kcp.io,resources=rootshards,verbs=get

func getRootShard(ctx context.Context, c ctrlruntimeclient.Client, rootShard *operatorv1alpha1.RootShard, shard *operatorv1alpha1.Shard, frontProxy *operatorv1alpha1.FrontProxy) (*operatorv1alpha1.RootShard, error) {
	if rootShard != nil {
		return rootShard, nil
	}

	var (
		ref       *corev1.LocalObjectReference
		namespace string
	)

	switch {
	case shard != nil:
		ref = shard.Spec.RootShard.Reference
		namespace = shard.Namespace

	case frontProxy != nil:
		ref = frontProxy.Spec.RootShard.Reference
		namespace = frontProxy.Namespace

	default:
		panic("Must be called with either a RootShard, Shard or FrontProxy.")
	}

	// a LocalObjectReference implies the referenced RootShard lives in the
	// same namespace as the referencing object
	rootShard = &operatorv1alpha1.RootShard{}
	if err := c.Get(ctx, types.NamespacedName{Namespace: namespace, Name: ref.Name}, rootShard); err != nil {
		return nil, fmt.Errorf("failed to get RootShard: %w", err)
	}

	return rootShard, nil
}
uh. So if somebody misconfigures one reference, we get into a panic loop? Can we error here? Let's say we add a cache server in the future, forget to update this switch, and boom...
I think you misread the code a bit (it's not my prettiest, but it's the best I could do).
Any misconfigured references will be caught and reported as errors in NewInternalKubeconfigClient; that is where the refs are checked and the appropriate functions are called. It's impossible for the code to panic just because of a misconfigured object. The panic occurs when a developer calls the function incorrectly: getRootShard is only called by getTLSConfig, which is only called by the three explicit helper functions. Calling any of them with nil is still a developer error, not something that should be reported as a runtime issue.
add a cache server in the future, forget to update this switch and boom...
That is exactly why it's a panic: so we do not forget.
mjudeikis left a comment
Can we remove that one panic from the reconciler? :)
Summary
This PR implements a new feature for the operator: it can now provision RBAC inside kcp, and thereby grant people permissions and take them away again.
This is the first PR to make use of the internal proxy (#87): since admins can configure any arbitrary workspace path or cluster name, the operator needs to be able to provision on any shard and, more importantly, figure out which shard to talk to. To solve this, we use our internal proxy ("internal" still means a standalone Deployment, of course).
Each Kubeconfig object can now hold a workspace and a desired list of permissions inside that workspace. The operator will reconcile these RBAC resources accordingly and also take care of cleaning up when a Kubeconfig is removed or changed (users can change the workspace that RBAC should be placed in, and the operator will first clean up the old cluster and then provision the new one).
To keep track of where RBAC has been deployed, a new field in the Kubeconfig status has been introduced. We discussed this and decided that this is a safe place to do so, as anyone with permissions to manage Kubeconfigs is technically an admin, so end users cannot/should not fiddle with Kubeconfigs. If they could, the operator would currently have no way of defending against malicious changes.
Each Kubeconfig manages its own RBAC, and all resources inside kcp are named based on the UID of the Kubeconfig object. This ensures uniqueness all around and avoids having to merge the desired RBAC into one ClusterRole(Binding) and untangle it again when the RBAC for one Kubeconfig is removed.
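The UID-based naming described above can be sketched as follows. The prefix and helper name are illustrative assumptions, not the operator's actual implementation; the point is only that deriving names from the Kubeconfig UID makes them collision-free and easy to clean up wholesale.

```go
package main

import "fmt"

// rbacResourceName derives a kcp-side RBAC object name from a Kubeconfig's
// UID. Because UIDs are unique per object, two Kubeconfigs can never produce
// colliding names, and deleting a Kubeconfig maps to deleting exactly the
// resources carrying its UID. The "kubeconfig-" prefix is hypothetical.
func rbacResourceName(kubeconfigUID string) string {
	return fmt.Sprintf("kubeconfig-%s", kubeconfigUID)
}

func main() {
	fmt.Println(rbacResourceName("4f1a9c2e-0000-0000-0000-000000000000"))
}
```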
Notably, since the kcp-operator now has to talk with shards and the front-proxy, this PR modifies the local e2e setup to work like the CI e2e test: build an operator image and deploy it into kind, rather than running the operator on the host machine. This is a bit sad for quick debugging tests, but it saves us from either dynamically exposing the pods through kind to the host or somehow rewriting URLs in the operator.
What Type of PR Is This?
/kind feature
Related Issue(s)
Fixes #49
Release Notes