Skip to content

Conversation

@yonglingsong
Copy link

Problem:

Domain-based authorization policies were ineffective during entity creation/modification
Authorization checks occurred BEFORE domain attachment, causing policy bypass
Multi-tenant isolation couldn't be enforced at creation time

Solution:

UNION Strategy: Combines existing domains + new domains for authorization evaluation
Feature-gated with DOMAIN_BASED_AUTHORIZATION_ENABLED (default: disabled)
Falls back to original logic when disabled
Impact:

✅ Domain-based policies now work correctly for entity creation
✅ No authorization bypass during domain transitions
✅ Backward compatible (disabled by default)
✅ Performance-conscious (domain collection only when enabled)

Testing:

Existing tests pass
New tests for UNION-based evaluation
Full end to end tests with Iceberg ingestion

@github-actions github-actions bot added ingestion PR or Issue related to the ingestion of metadata product PR or Issue related to the DataHub UI/UX devops PR or Issue related to DataHub backend & deployment community-contribution PR or Issue raised by member(s) of DataHub Community labels Nov 14, 2025
@datahub-cyborg datahub-cyborg bot added the needs-review Label for PRs that need review from a maintainer. label Nov 14, 2025
@yoonhyejin
Copy link
Collaborator

Hi @yonglingsong, thanks for raising the PR! could you take a look at this PR @chakru-r @david-leifker ?

@datahub-cyborg datahub-cyborg bot added pending-submitter-response Issue/request has been reviewed but requires a response from the submitter and removed needs-review Label for PRs that need review from a maintainer. labels Nov 18, 2025
* @return whether the aspect proposal is valid
*/
public final Stream<AspectValidationException> validateProposed(
@Nonnull Collection<? extends BatchItem> mcpItems,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This approach is not secure because the validateProposed method is executed outside of a transaction and the aspect lookup is subject to caching. The validation of a non-existent domain can only safely be done within a database transaction, otherwise it is subject to race conditions which could lead to privilege escalation.

*/
private Set<Urn> getEntityDomains(Urn entityUrn, AspectRetriever aspectRetriever) {
try {
Aspect domainsAspect = aspectRetriever.getLatestAspectObject(entityUrn, DOMAINS_ASPECT_NAME);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the read which is not protected by a transaction and subject to caching.


// 0. Determine if we have a wildcard policy.
if (actorFilter.isAllUsers()) {
allUsers = true;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might have been designed to imply that that all users == all groups. I think we might want allGroups = true in this condition as well.

On the second conditional this makes sense to fix.

batchSize: ${BOOTSTRAP_SYSTEM_UPDATE_OWNERSHIP_TYPES_BATCH_SIZE:1000}
reprocess:
enabled: ${BOOTSTRAP_SYSTEM_UPDATE_OWNERSHIP_TYPES_REPROCESS:false}
lineageIndexFields:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe we should be removing unrelated configuration here.

@@ -1,14 +1,25 @@
package com.linkedin.metadata.resources.entity;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should avoid putting authorization logic in only 1 of the 3 APIs. This only covers rest.li (deprecated) API. The logic should be included in the authorization util classes so it can be used by OpenAPI and GraphQL. We don't want to apply access controls to just one of the APIs.

@@ -1,6 +1,9 @@
package com.linkedin.metadata.resources.entity;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to the AspectResource, we don't want to include authorization logic only in rest.li and need to support the other APIs like OpenAPI and Graphql. Moving the logic to the authorization util classes and using the util in the other APIs is ideal.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

community-contribution PR or Issue raised by member(s) of DataHub Community devops PR or Issue related to DataHub backend & deployment ingestion PR or Issue related to the ingestion of metadata pending-submitter-response Issue/request has been reviewed but requires a response from the submitter product PR or Issue related to the DataHub UI/UX

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants