Implement UserInfo Endpoint Integration for User Data Enrichment #2227

@jrschumacher

Description

1. Problem Statement / Background

Currently, our platform relies on Keycloak as an Identity Provider (IdP). Keycloak offers advanced features like custom claims mappers, which allow embedding user identifiers directly into access tokens. While functional, this approach is generally considered an anti-pattern for several reasons:

  • Access Token Purpose: Access tokens are primarily intended for authorization decisions (determining what a user can do), not as general-purpose identity information carriers.
  • Token Bloat: Including extensive user information can lead to larger, more cumbersome access tokens.
  • OIDC Best Practices: Standard OIDC practice is for the client to read identity information from the id_token, and for the Relying Party (our platform backend) to retrieve more comprehensive user claims from the UserInfo endpoint.
  • id_token Limitation: While other IdPs provide user information via the id_token, this token is intended for the client (Relying Party frontend) and should not be directly passed to or validated by backend resource servers for identity claims.

2. Proposed Solution

To align with OIDC best practices and provide a more robust mechanism for fetching user information, we propose adding support for retrieving user claims from the IdP's UserInfo endpoint.

The platform's backend services will use the access_token provided by the client to query the UserInfo endpoint.

Key Technical Requirement: Token Exchange for DPoP Compatibility
To ensure compatibility with IdPs or configurations where DPoP (Demonstrating Proof of Possession, RFC 9449) is enabled, the platform will perform a token exchange (as per RFC 8693). The platform exchanges the client-provided access_token, which may be DPoP-bound to the client's key, for a new token that the platform backend can use to access the UserInfo endpoint. The new token may be DPoP-bound to the backend's key, or it may be a plain bearer token if the IdP allows.

After a successful token exchange, the platform will use the exchanged token to fetch user claims from the UserInfo endpoint.
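The two steps above can be sketched in Go as follows. This is a minimal illustration, not the platform's implementation: the function names, the use of HTTP Basic client authentication, and the bearer-only UserInfo call are assumptions. The DPoP proof header that a DPoP-bound exchanged token would require is deliberately omitted, and the UserInfo response is assumed to be plain JSON rather than a signed JWT.

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"net/http"
	"net/url"
	"strings"
)

// Grant and token type identifiers defined by RFC 8693.
const (
	grantTypeTokenExchange = "urn:ietf:params:oauth:grant-type:token-exchange"
	tokenTypeAccessToken   = "urn:ietf:params:oauth:token-type:access_token"
)

// tokenExchangeForm builds the RFC 8693 request body that swaps the
// client-provided access token for one the backend can present itself.
func tokenExchangeForm(subjectToken string) url.Values {
	return url.Values{
		"grant_type":           {grantTypeTokenExchange},
		"subject_token":        {subjectToken},
		"subject_token_type":   {tokenTypeAccessToken},
		"requested_token_type": {tokenTypeAccessToken},
	}
}

// exchangeToken posts the form to the IdP token endpoint, authenticating as
// the platform's own OIDC client. clientID and clientSecret are hypothetical
// parameters; Basic authentication is an assumption of this sketch.
func exchangeToken(ctx context.Context, hc *http.Client, tokenURL, clientID, clientSecret, subjectToken string) (string, error) {
	body := strings.NewReader(tokenExchangeForm(subjectToken).Encode())
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, tokenURL, body)
	if err != nil {
		return "", err
	}
	req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
	req.SetBasicAuth(clientID, clientSecret)

	resp, err := hc.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("token exchange failed: %s", resp.Status)
	}
	var out struct {
		AccessToken string `json:"access_token"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return "", err
	}
	if out.AccessToken == "" {
		return "", fmt.Errorf("token exchange response has no access_token")
	}
	return out.AccessToken, nil
}

// fetchUserInfo calls the UserInfo endpoint with the exchanged token,
// presented here as a plain bearer token, and decodes the JSON claims.
func fetchUserInfo(ctx context.Context, hc *http.Client, userInfoURL, token string) (map[string]any, error) {
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, userInfoURL, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+token)

	resp, err := hc.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("userinfo request failed: %s", resp.Status)
	}
	var claims map[string]any
	if err := json.NewDecoder(resp.Body).Decode(&claims); err != nil {
		return nil, err
	}
	return claims, nil
}
```

A caller would pass the result of exchangeToken to fetchUserInfo, so the original client token never reaches the UserInfo endpoint.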

3. Caching Strategy

To minimize latency and reduce the number of round trips to the IdP's UserInfo endpoint, fetched user information will be cached.

  • Cache Scope: Caching will be implemented in-memory within each service instance. This keeps the initial implementation simple and defers a distributed cache until it proves necessary.
  • Cache Library: We will utilize eko/gocache as the caching abstraction layer, leveraging its support for various cache drivers.
  • In-Memory Provider: For the in-memory implementation, hypermodeinc/ristretto will be used due to its performance characteristics.
  • Cache Key: To ensure data integrity and prevent collisions, cache entries will be keyed by a composite of the issuer (IdP's issuer URL) and the sub (user's subject identifier). This uniquely identifies a user across potentially multiple IdPs.
  • Cache Lifetime: The TTL (Time-To-Live) for cache entries will be configurable.
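To make the keying and expiry rules concrete, here is a small sketch that uses only the Go standard library as a stand-in for eko/gocache backed by ristretto. It does not show either library's API. The type and function names are hypothetical. The point is the composite (issuer, sub) key and the configurable TTL.

```go
package main

import (
	"sync"
	"time"
)

// userKey is the composite cache key: the same sub issued by two different
// IdPs yields two distinct keys, so entries cannot collide.
type userKey struct {
	Issuer string
	Sub    string
}

type cacheEntry struct {
	claims    map[string]any
	expiresAt time.Time
}

// userInfoCache is a per-instance, in-memory TTL cache (stand-in only).
type userInfoCache struct {
	mu      sync.Mutex
	ttl     time.Duration
	now     func() time.Time // replaceable clock, useful in tests
	entries map[userKey]cacheEntry
}

func newUserInfoCache(ttl time.Duration) *userInfoCache {
	return &userInfoCache{ttl: ttl, now: time.Now, entries: make(map[userKey]cacheEntry)}
}

// Set stores claims for (issuer, sub) until the TTL elapses.
func (c *userInfoCache) Set(issuer, sub string, claims map[string]any) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.entries[userKey{Issuer: issuer, Sub: sub}] = cacheEntry{
		claims:    claims,
		expiresAt: c.now().Add(c.ttl),
	}
}

// Get returns cached claims, or false when the entry is missing or expired.
// Expired entries are removed lazily on lookup.
func (c *userInfoCache) Get(issuer, sub string) (map[string]any, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	k := userKey{Issuer: issuer, Sub: sub}
	e, ok := c.entries[k]
	if !ok {
		return nil, false
	}
	if !c.now().Before(e.expiresAt) {
		delete(c.entries, k)
		return nil, false
	}
	return e.claims, true
}
```

Unlike ristretto, this stand-in has no size bound or eviction policy, so it only illustrates the lookup semantics.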

4. Configuration

The following configuration parameters will be introduced to allow administrators to control this feature:

  • server.auth.userinfoEnrichment (boolean):

    • Description: Enables or disables fetching user information from the UserInfo endpoint.
    • Rationale: Administrators who have already configured custom claims mappers to include necessary identifiers in the access_token may choose to disable this feature to avoid the additional network round trip.
    • Default: false (Opt-in for the new feature).
  • server.auth.userinfoCacheTTL (duration string, e.g., "5m", "1h"):

    • Description: Specifies the Time-To-Live for cached UserInfo responses.
    • Default: "5m" (5 minutes).
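One way the two settings could be loaded and validated is sketched below. The struct, the function name, and the rule that rejects non-positive durations are assumptions made for illustration. Only the key names and the defaults (false, "5m") come from this issue.

```go
package main

import (
	"fmt"
	"time"
)

// Default for server.auth.userinfoCacheTTL as proposed in this issue.
const defaultUserInfoCacheTTL = 5 * time.Minute

// userInfoSettings mirrors the two proposed keys (hypothetical struct).
type userInfoSettings struct {
	Enrichment bool          // server.auth.userinfoEnrichment, default false
	CacheTTL   time.Duration // server.auth.userinfoCacheTTL, default 5m
}

// parseUserInfoSettings converts raw config values into typed settings.
// An empty TTL string falls back to the default; malformed or non-positive
// durations are rejected (the latter is an assumption of this sketch).
func parseUserInfoSettings(enrichment bool, cacheTTL string) (userInfoSettings, error) {
	s := userInfoSettings{Enrichment: enrichment, CacheTTL: defaultUserInfoCacheTTL}
	if cacheTTL == "" {
		return s, nil
	}
	d, err := time.ParseDuration(cacheTTL)
	if err != nil {
		return userInfoSettings{}, fmt.Errorf("server.auth.userinfoCacheTTL: %w", err)
	}
	if d <= 0 {
		return userInfoSettings{}, fmt.Errorf("server.auth.userinfoCacheTTL: must be positive, got %s", d)
	}
	s.CacheTTL = d
	return s, nil
}
```

Using time.ParseDuration means values such as "5m" and "1h" are accepted as written in the examples above.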

5. Implementation Guidance / Developer Notes

  • Service Responsibility: The majority of the code related to handling the token exchange and fetching user information will reside within the entity resolution service (ERS).
  • Strategic Alignment: Resolving entities from identity data is already the role of the ERS, so this placement supports the long-term vision of creating a generic OIDC UserInfo ERS. This future ERS would be compatible with most IdPs and serve as a complementary approach to the existing ERS claims mode.

6. Documentation Requirements

  • Draft comprehensive documentation explaining the new UserInfo enrichment feature and its configuration options (server.auth.userinfoEnrichment, server.auth.userinfoCacheTTL).
  • Important Note for Administrators: Include a prominent note in the documentation advising administrators against over-privileging the platform's OIDC client. The opentdf-standard scope (or an equivalent minimal scope required for token exchange and UserInfo access) should generally be sufficient. Clearly explain the principle of least privilege in this context.

7. Acceptance Criteria

  • When server.auth.userinfoEnrichment is true:
    • The entity resolution service attempts to fetch user information from the IdP's UserInfo endpoint, starting from the client-provided access_token.
    • A token exchange is performed prior to calling the UserInfo endpoint to handle DPoP and obtain a suitable token for the backend.
    • Successfully fetched UserInfo responses are cached in-memory within the ERS, keyed by issuer and sub.
    • Cache entries expire according to the server.auth.userinfoCacheTTL setting.
    • Subsequent requests for the same user (same issuer and sub) within the TTL serve data from the cache.
  • When server.auth.userinfoEnrichment is false:
    • No attempt is made to call the UserInfo endpoint or perform a token exchange for this purpose.
  • Errors during token exchange or UserInfo endpoint calls are handled gracefully (e.g., logged, and the system proceeds without enriched user info if possible, or returns an appropriate error if the info is critical).
  • The system correctly uses the exchanged token (not the original client token) when calling the UserInfo endpoint.
  • Code related to token exchange and UserInfo fetching is primarily located within the entity resolution service.
  • Documentation for administrators regarding this feature and its configuration is created.
  • Documentation includes a warning about not over-privileging the platform's OIDC client.
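The gating and graceful-failure criteria can be illustrated with the sketch below. All names are hypothetical. The source interface stands in for the combined token exchange and UserInfo call, fixedSource exists only for demonstration, and the merge rule (claims already present in the token are not overwritten) is an assumption, not something this issue specifies.

```go
package main

import (
	"context"
	"errors"
	"log"
)

type claims map[string]any

// userInfoSource hides the token exchange and UserInfo request behind one call.
type userInfoSource interface {
	Lookup(ctx context.Context, accessToken string) (claims, error)
}

// errSourceUnavailable is a sample failure used by the demonstration source.
var errSourceUnavailable = errors.New("userinfo source unavailable")

// fixedSource returns canned values; it exists only for demos and tests.
type fixedSource struct {
	result claims
	err    error
}

func (f fixedSource) Lookup(context.Context, string) (claims, error) {
	return f.result, f.err
}

// enricher applies the server.auth.userinfoEnrichment switch.
type enricher struct {
	enabled bool
	source  userInfoSource
}

// Enrich returns tokenClaims unchanged when the feature is disabled or the
// lookup fails; failures are logged rather than propagated. On success it
// adds UserInfo claims that the token does not already carry.
func (e enricher) Enrich(ctx context.Context, accessToken string, tokenClaims claims) claims {
	if !e.enabled {
		return tokenClaims // no token exchange, no UserInfo call
	}
	extra, err := e.source.Lookup(ctx, accessToken)
	if err != nil {
		log.Printf("userinfo enrichment skipped: %v", err)
		return tokenClaims
	}
	merged := make(claims, len(tokenClaims)+len(extra))
	for k, v := range extra {
		merged[k] = v
	}
	for k, v := range tokenClaims {
		merged[k] = v // token claims win on conflict (assumption)
	}
	return merged
}
```

A deployment where the enriched data is critical would return the error to the caller instead of falling back, as the criterion above allows.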

8. Out of Scope (for this issue)

  • Implementation of a distributed cache for UserInfo.
  • Modifying client-side handling of id_token.
  • Full implementation of the "generic OIDC UserInfo ERS" (this issue lays foundational work).
