Per-workspace Authentication #3481

xrstf · 2025-07-11T18:41:33Z

Summary

This PR implements a new feature for kcp: per-workspace authentication. Admins can configure authentication configurations (as known from Kubernetes) for each workspace type and if a request targets a logicalcluster for which the type has auth configs, then this new authentication mechanism will form a union with the existing kcp authentication.

This feature is guarded by a new alpha feature gate WorkspaceAuthentication, disabled by default. It can be set on each shard and each front-proxy independently.

Feature Overview

The primary intended usecase for this is inside the front-proxy. The proxy has a global view and can reliably resolve the necessary objects across all shards. If desired admins can even setup a dedicated front-proxy that has this new feature enabled.

However, to make e2e testing more convenient, the feature is also part of a shard's localproxy (if the feature is enabled on that shard, of course). In this case, the authenticator will only work if all of AuthConfig, WorkspaceType, Workspace and LogicalCluster are on the same shard. It purposefully does not try to gain a global view across all shards and this is meant to be done by using front-proxies.

API

The new API type WorkspaceAuthenticationConfiguration is based entirely on kube-apiserver's AuthenticationConfiguration, which could not be used because of a) the heavy dependency and b) because it doesn't have any json tags.

There is a primitive set of conversion functions to convert from our API to Kube API, in order to then construct the OIDC authenticators.

Architecture

kcp's front-proxy already comes with an index (a collection of maps of shards, clusters, workspaces etc.) and an index_controller (responsible for connecting dynamically to each shard, watch these resources on them and feed the index). This index is used to resolve incoming workspace paths and/or logicalclusters to their target shard.

This index was extended to also keep track of the workspace type for each logicalcluster (thankfully the fully-qualified workspace type is stored as an annotation on each LogicalCluster) and will include this information when calling the Lookup() function.

The index and its controller are cloned for this new feature, forming the authIndex and authIndexController. These do similar things, but keep track of WorkspaceTypes and AuthConfigs. The authIndex builds and starts the authenticators for each AuthConfig, since the OIDC authentication in Kubernetes is stateful and requires discovery first, so it's not possible to just statelessly build an authenticator and use it only for one request.

The authIndex then ultimately also has a Lookup(workspaceType) function, which will resolve the type, find the auth configs and then assemble them into a single authenticator union.

Important: For a token to be accepted, it has to contain both the configured kcp audience (from the CLI flags) and the audiences configured in the AuthenticationConfiguration objects. It's simply not possible in Kube to have the "core" audiences be ignored because of the way Kube handles audience validation. I think this also makes sense to require tokens to at least specify that they are meant to be used in kcp.

Kube does audience validation in a weird way. It stores the *REQUIRED* audiences in the request context, but named `audiences`, leading you to believe these are the audiences that came with the request. The existence of this context field will make Kube validate that the incoming token has this audience set. There is no way to disable this behaviour when constructing a Kube OIDC authenticator.

Even worse: Even if you hack around this and replace the context of a request entirely before calling our own per-workspace authenticator, then this authenticator's response will only contain the audiences configured in the AuthenticationConfigurations. This means the core kcp audience will be missing from the auth result.

BUT Kube's authenticator is built to check that after calling the underlying authenticator (us), the audiences match what is in the request context. This context it has remembered and is out of our reach entirely. So if we cheated with a custom context, the surrounding check by Kube would fail again.

Request Handling

Importantly, the front-proxy has the concept of optional authentication, where the authenticator is only called if additional authn flags were configured on the front-proxy. Otherwise unauthenticated requests are passed on to the shards.

With this feature, the existence of an authenticator is not a static decision anymore, but depends on the target logicalcluster.

To allow for this, this PR...

refactors the proxy's mapping handler to not perform the index lookup and forwarding in the same code, but instead have a new lookup middleware that is only responsible for calling index.Lookup() and storing the logicalcluster, shard and now workspacetype in the request's context. It's necessary to have this information in the context because...
a new AuthResolver middleware is added early (but after lookup) into the handler chain of both the front-proxy and the kcp shard server. This new middleware will take the looked up workspacetype and ask the authIndex for the relevant authenticator. If one is found, it is stored in the request context. Then...
in the front-proxy, the optionalAuthentication middleware now takes the optional authenticator out of the context and combines it smartly with the other CLI-based auth configs,
in the kcp shard server, the existing Authenticator is unioned with a WorkspaceAuthenticator, which in itself just takes the authenticator out of the context and calls it.

Notable, in the shard/localproxy scenario, there is no authIndexController managing the watches per shard, since there is only the local shard. This means the kcp server has to start one single ShardWatcher directly, whereas the front-proxy has the authIndexController to manage ShardWatchers dynamically. ShardWatchers as an abstraction only exist because of this stripped-down behaviour in the localproxy.

Testing

I've added two new e2e tests, covering the basics of this feature. For testing I decided to use mockoidc, which allows to run an OIDC server in-process, which is just super convenient. However the upstream module has some bugs, so I had to fork it and fix those.

(This makes me wonder if httest should also come as a Go library, so our authorizer tests do have to manage fewer external processes...)

What Type of PR Is This?

/kind feature

Related Issue(s)

Fixes #3428

Release Notes

Add per-workspace authentication feature (behind the disabled by default feature gate `WorkspaceAuthentication`), allowing to configure additional authenticators (JWT/OIDC at the moment) for workspace types in order to admit external users into logical clusters.

xrstf · 2025-08-02T21:47:02Z

/retest

xrstf · 2025-08-03T16:18:18Z

/retest

xrstf · 2025-08-03T16:59:46Z

/retest

xrstf · 2025-08-03T17:42:33Z

/retest

embik

A couple of notes. In addition, I have two important topics I couldn't quite place in the code:

We should make sure we attach a scope to authenticated users in a workspace, so that their identity is only valid in the workspace they have authenticated for.
What happens with the mini-front-proxy in the server when a WorkspaceType / AuthenticationConfig in a different shard is referenced? Is that (not) supposed to work?

sdk/apis/tenancy/v1alpha1/types_workspacetype.go

test/e2e/authentication/workspace_test.go

pkg/authentication/authentication_controller.go

pkg/server/config.go

xrstf · 2025-08-06T16:56:29Z

I've added that functionality and added an e2e test to verify it.
Yes, that's correct. The localproxy auth functionality only works if Workspace, WorkspaceType, LogicalCluste and WorkspaceAuthentication exist on the local shard. There is purposefully no intention to change this, as I would simply recommend using the front-proxy if you want this feature across all your shards. That's at least how I intended it.

xrstf · 2025-08-08T11:45:43Z

/retest

mjudeikis

More questions than issues :) looks very promising!

mjudeikis · 2025-08-11T06:13:57Z

Makefile

@@ -354,7 +355,7 @@ test-run-sharded-server: LOG_DIR ?= $(WORK_DIR)/.kcp
 test-run-sharded-server:
 	mkdir -p "$(LOG_DIR)" "$(WORK_DIR)/.kcp"
 	rm -f "$(WORK_DIR)/.kcp/ready-to-test"
-	UNSAFE_E2E_HACK_DISABLE_ETCD_FSYNC=true NO_GORUN=1 ./bin/sharded-test-server --quiet --v=2 --log-dir-path="$(LOG_DIR)" --work-dir-path="$(WORK_DIR)" --shard-run-virtual-workspaces=false --shard-feature-gates=$(TEST_FEATURE_GATES) $(TEST_SERVER_ARGS) --number-of-shards=2 2>&1 & PID=$$!; echo "PID $$PID" && \
+	UNSAFE_E2E_HACK_DISABLE_ETCD_FSYNC=true NO_GORUN=1 ./bin/sharded-test-server --quiet --v=2 --log-dir-path="$(LOG_DIR)" --work-dir-path="$(WORK_DIR)" --shard-run-virtual-workspaces=false --shard-feature-gates=$(TEST_FEATURE_GATES) --proxy-feature-gates=$(PROXY_FEATURE_GATES) $(TEST_SERVER_ARGS) --number-of-shards=2 2>&1 & PID=$$!; echo "PID $$PID" && \


With second flag this looks strange now. 2 featureflags flags. Should we maybe get ticket to deprecate --shard-feature-gates in favor of --feature-gates and use use one everywhere?

Because I needed to be able to control the feature independently between shard and front-proxy, because some tests are meant to ensure that the front-proxy does the authentication. That's why the sharded-test-server has a new CLI flag. The front-proxy's flag is called --feature-gate, like the shard's CLI flag is. Only on the test server do we need 2 sets of feature gates, hence 2 CLI flags.

cmd/kcp-front-proxy/options/options.go

docs/content/concepts/authentication/workspace.md

mjudeikis · 2025-08-11T06:18:34Z

docs/content/concepts/authentication/workspace.md

+
+Workspace types then reference a set of auth configs, and their configuration will apply to all their workspaces/logicalclusters. Compared to many other settings in a `WorkspaceType` that work only as a preset for _new_ workspaces, the configured auth configs will continue to affect workspaces, so when a `WorkspaceType` is changed, this will impact existing workspaces, too.
+
+For every incoming HTTPS request, kcp will then resolve the logicalcluster, determine the used workspace type, assemble the list of auth configs and create an authenticator suitable for exactly the one logicalcluster targeted by the request. This workspace authenticator is an *alternative* to kcp's regular authentication (i.e. it forms a union with it).


What is behaviour when you dont have set any core authentication and only AuthConfig?
One might not even have auth flags in kcp and just AuthConfig?

If you do not have any auth flags configured on the front-proxy, it will pass through all requests targetting clusters without AuthConfigs. For clusters with AuthConfigs (via their types), the front-proxy will perform authentication and reject the request if it cannot be authenticated.

Without any auth flags, it will be difficult to access the root workspace I think.

Without any auth flags, it will be difficult to access the root workspace I think.
you could just use admin kubeconfig as admin. Its a security feature - you know there is no OIDC access to anything else apart workspaces, using it.

docs/content/concepts/authentication/workspace.md

mjudeikis · 2025-08-11T06:21:43Z

docs/content/concepts/authentication/workspace.md

+subjects:
+  - apiGroup: rbac.authorization.k8s.io
+    kind: Group
+    name: oidc:admins


Your example does not have oidc: prefix. This will not work, right?

Can one create group system:masters or system:kcp:admin` in external authentication OIDC and so get wider access than one should?

Thinking if we should force everybody to use prefix for authentication configs? Or prevent certain groups in these authConfigs.

@embik wdyt?

Hm, good question. We might have to do that indeed. I wonder if we should have a prefix prepended that is simply the cluster ID, even though that would be massively annoying for creating RBAC ...

I'm not sure this is a problem. Yes the example could show the username claim with the "oidc:" prefix, but if you have AuthConfigs that do not configure a claim prefix, then yes, users could get any arbitrary group assigned to them. But their authentication is limitted (via the scope annotation thingy on the UserInfo) to just one single logicalcluster, so it should be fine. And who are we to prevent kcp users from giving someone poweruser permissions inside their clusters?

But I'm open to explicit rules you guys want to see here.

We have to avoid privileged identities (the one @mjudeikis mentioned) because they can reach past the workspace and circumvent authorization. I'm not sure that scopes are enough for keeping them in check. Maybe the authenticator should reject anything with a system: prefix.

This is the point, They might be scoped to workspace, but Im not even sure what other "holes" they could poke with this.

Do we want to reject non-prefixed AuthSettings or eventually forbid the WorkspaceAuthenticator from returning a set of protected groups, regardless of the OIDC settings?

I've added a wrapper that removes system: groups and rejects system: names outright. A new e2e test ensures that this works.

pkg/index/index.go

pkg/proxy/server.go

On-behalf-of: @SAP christoph.mewes@sap.com

…re for localproxy My previous attempt at indexing the workspace types was not actually properly shard-aware. This commit fixes that and moves the code to pkg/authentication/ since it wil be soon be re-used by the localproxy. On-behalf-of: @SAP christoph.mewes@sap.com

…also have the feature working This is mainly thought to make `make test-e2e` work properly. My main idea is still to enable this feature on front-proxies, not per-shard. On-behalf-of: @SAP christoph.mewes@sap.com

On-behalf-of: @SAP christoph.mewes@sap.com

* the service account resolver needs to run before our authentication * additionalAuth alone is not alone; it actually depends on the target cluster whether authentication is supposed to happen, i.e. it's request-dependent. Rework to introduce a new middleware that finds the authenticator before the authentication middleware, so it can make an informed decision. On-behalf-of: @SAP christoph.mewes@sap.com

On-behalf-of: @SAP christoph.mewes@sap.com

…works On-behalf-of: @SAP christoph.mewes@sap.com

On-behalf-of: @SAP christoph.mewes@sap.com

embik

/approve

kcp-ci-bot · 2025-08-21T07:40:52Z

LGTM label has been added.

Git tree hash: 1961af325ace6f7c1b219e0d74737b78ec5fc0f1

kcp-ci-bot · 2025-08-21T07:40:54Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: embik

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [embik]
~~config/crds/OWNERS~~ [embik]
~~sdk/apis/OWNERS~~ [embik]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

embik · 2025-08-21T12:09:45Z

/retest

xrstf changed the title ~~WIP - Per workspace authn~~ WIP - Per-workspace Authentication Jul 11, 2025

xrstf force-pushed the per-workspace-authn branch 3 times, most recently from f5ec0ce to 2092bcc Compare August 2, 2025 21:13

xrstf force-pushed the per-workspace-authn branch from 2092bcc to 51a4db2 Compare August 3, 2025 15:25

embik reviewed Aug 6, 2025

View reviewed changes

xrstf force-pushed the per-workspace-authn branch from f1ad557 to 1c72f76 Compare August 6, 2025 16:54

kcp-ci-bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-none Denotes a PR that doesn't merit a release note. labels Aug 6, 2025

xrstf changed the title ~~WIP - Per-workspace Authentication~~ Per-workspace Authentication Aug 7, 2025

kcp-ci-bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 7, 2025

mjudeikis reviewed Aug 11, 2025

View reviewed changes

ntnn mentioned this pull request Aug 12, 2025

[WIP] Add system:cluster:<cluster> group on bind requests #3530

Open

kcp-ci-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 14, 2025

xrstf added 4 commits August 14, 2025 16:12

only resolve URLs once instead of once per mapping

12a1a39

On-behalf-of: @SAP christoph.mewes@sap.com

remove redundant and wrong status code handling

f84efcc

On-behalf-of: @SAP christoph.mewes@sap.com

WIP: middleware to resolve clusters

dc95aab

On-behalf-of: @SAP christoph.mewes@sap.com

Add WorkspaceAuthenticationConfiguration

d1e378f

On-behalf-of: @SAP christoph.mewes@sap.com

xrstf added 16 commits August 14, 2025 16:12

add basic e2e test to ensure custom authentication for a workspace works

16183bc

On-behalf-of: @SAP christoph.mewes@sap.com

lint

b1f96fb

On-behalf-of: @SAP christoph.mewes@sap.com

enable feature in shared e2e tests

8a1938d

On-behalf-of: @SAP christoph.mewes@sap.com

cleanup unit test

7a89236

On-behalf-of: @SAP christoph.mewes@sap.com

constraint testcase to new workspace to allow multiple runs

74e8a17

On-behalf-of: @SAP christoph.mewes@sap.com

cancel and stop shard watchers when servers are stopped

d247b51

On-behalf-of: @SAP christoph.mewes@sap.com

remove unneeded parameter since we're now relying on the context

be2d5ff

On-behalf-of: @SAP christoph.mewes@sap.com

improve error handling

32a630f

On-behalf-of: @SAP christoph.mewes@sap.com

rename ref field from Configuration to Name

5e8d9e2

On-behalf-of: @SAP christoph.mewes@sap.com

extend e2e testcases

39d6d4e

On-behalf-of: @SAP christoph.mewes@sap.com

scope authn infos to the given logicalcluster, add test to ensure it …

242e852

…works On-behalf-of: @SAP christoph.mewes@sap.com

add documentation

1c11747

On-behalf-of: @SAP christoph.mewes@sap.com

prevent workspace authentication from returning system groups or names

46b1c0c

On-behalf-of: @SAP christoph.mewes@sap.com

xrstf force-pushed the per-workspace-authn branch from b6cd80d to 46b1c0c Compare August 15, 2025 09:26

kcp-ci-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 15, 2025

xrstf requested review from embik and mjudeikis August 15, 2025 09:39

ntnn mentioned this pull request Aug 19, 2025

feature: add logical cluster pseudo groups to cross-workspace authorization checks #3513

Open

1 task

embik approved these changes Aug 21, 2025

View reviewed changes

kcp-ci-bot assigned embik Aug 21, 2025

kcp-ci-bot added the lgtm Indicates that a PR is ready to be merged. label Aug 21, 2025

kcp-ci-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 21, 2025

kcp-ci-bot merged commit 339e173 into kcp-dev:main Aug 21, 2025
14 checks passed

xrstf deleted the per-workspace-authn branch August 25, 2025 09:11


		Workspace types then reference a set of auth configs, and their configuration will apply to all their workspaces/logicalclusters. Compared to many other settings in a `WorkspaceType` that work only as a preset for _new_ workspaces, the configured auth configs will continue to affect workspaces, so when a `WorkspaceType` is changed, this will impact existing workspaces, too.

		For every incoming HTTPS request, kcp will then resolve the logicalcluster, determine the used workspace type, assemble the list of auth configs and create an authenticator suitable for exactly the one logicalcluster targeted by the request. This workspace authenticator is an alternative to kcp's regular authentication (i.e. it forms a union with it).

Per-workspace Authentication #3481

Per-workspace Authentication #3481

Uh oh!

Conversation

xrstf commented Jul 11, 2025 • edited by embik Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Feature Overview

API

Architecture

Request Handling

Testing

What Type of PR Is This?

Related Issue(s)

Release Notes

Uh oh!

xrstf commented Aug 2, 2025

Uh oh!

xrstf commented Aug 3, 2025

Uh oh!

xrstf commented Aug 3, 2025

Uh oh!

xrstf commented Aug 3, 2025

Uh oh!

embik left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

xrstf commented Aug 6, 2025

Uh oh!

xrstf commented Aug 8, 2025

Uh oh!

mjudeikis left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

embik Aug 11, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

embik left a comment

Choose a reason for hiding this comment

Uh oh!

kcp-ci-bot commented Aug 21, 2025

Uh oh!

xrstf commented Jul 11, 2025 •

edited by embik

Loading

embik Aug 11, 2025 •

edited

Loading