Skip to content
Merged
Show file tree
Hide file tree
Changes from 6 commits
Commits
Show all changes
33 commits
Select commit Hold shift + click to select a range
67fdaca
adding support to use a kms key for s3 buckets
fabio-rizzo-01 Oct 10, 2025
5cd4c4d
Update CHANGELOG.md
fabio-rizzo-01 Oct 13, 2025
7db0a87
Merge branch 'main' into feature/s3-kms-key
fabio-rizzo-01 Oct 13, 2025
cf4ec3e
Update CHANGELOG.md
fabio-rizzo-01 Oct 13, 2025
ffbe979
Update CHANGELOG.md
fabio-rizzo-01 Oct 13, 2025
5a0f3eb
fixing indentation issue
fabio-rizzo-01 Oct 13, 2025
708630c
fixing indentation issue
fabio-rizzo-01 Oct 13, 2025
3a10f7d
temporary changes to handle iceberg s3 properties
fabio-rizzo-01 Oct 21, 2025
543dabc
Merge remote-tracking branch 'private/feature/s3-kms-key' into featur…
fabio-rizzo-01 Oct 21, 2025
351dc07
Merge remote-tracking branch 'private/main-30-06' into feature/s3-kms…
fabio-rizzo-01 Oct 21, 2025
b8dad37
Merge remote-tracking branch 'private/main-30-06' into feature/s3-kms…
fabio-rizzo-01 Oct 24, 2025
6feb766
Merge remote-tracking branch 'private/main-30-06' into feature/s3-kms…
fabio-rizzo-01 Nov 6, 2025
d0907a1
removed s3 key table changes. added new properties for AwsStorageConf…
fabio-rizzo-01 Nov 14, 2025
e026c78
Merge remote-tracking branch 'private/main-30-06' into feature/s3-kms…
fabio-rizzo-01 Nov 14, 2025
7c33b55
removed s3 key table changes. added new properties for AwsStorageConf…
fabio-rizzo-01 Nov 14, 2025
65a84fb
removed s3 key table changes. added new properties for AwsStorageConf…
fabio-rizzo-01 Nov 14, 2025
b8bacb9
adding support to use a kms key for s3 buckets (#17)
fabio-rizzo-01 Nov 14, 2025
8850df1
fixed merge issue
fabio-rizzo-01 Nov 14, 2025
83115fd
Merge branch 'main-30-06' into feature/s3-kms-key
fabio-rizzo-01 Nov 14, 2025
b604568
Feature/s3 kms key (#18)
fabio-rizzo-01 Nov 14, 2025
d85e4cf
fixed issue that broke integration tests and updated code based on PR…
fabio-rizzo-01 Nov 20, 2025
420c509
Merge remote-tracking branch 'private/feature/s3-kms-key' into featur…
fabio-rizzo-01 Nov 20, 2025
a7d6413
adding support to use a kms key for s3 buckets (#17)
fabio-rizzo-01 Nov 14, 2025
e4c8260
Feature/s3 kms key (#18)
fabio-rizzo-01 Nov 14, 2025
ab5cdf0
Merge remote-tracking branch 'private/main-30-06' into main-30-06
fabio-rizzo-01 Nov 20, 2025
7192c2f
Merge remote-tracking branch 'private/main-30-06' into feature/s3-kms…
fabio-rizzo-01 Nov 20, 2025
4b2dcd3
fixed issue that broke integration tests and updated code based on PR…
fabio-rizzo-01 Nov 20, 2025
5da097a
Feature/s3 kms key (#19)
fabio-rizzo-01 Nov 20, 2025
8bd5b74
adding support to use a kms key for s3 buckets (#17)
fabio-rizzo-01 Nov 14, 2025
8a1fe7e
Feature/s3 kms key (#18)
fabio-rizzo-01 Nov 14, 2025
5a60a69
Feature/s3 kms key (#19)
fabio-rizzo-01 Nov 20, 2025
2e882bf
Merge remote-tracking branch 'private/main-30-06' into main-30-06
fabio-rizzo-01 Nov 20, 2025
2addad4
Merge remote-tracking branch 'private/main-30-06' into feature/s3-kms…
fabio-rizzo-01 Nov 20, 2025
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -57,6 +57,7 @@ request adding CHANGELOG notes for breaking (!) changes and possibly other secti

### New Features

- Updated catalogs creation to include AWS kms key ,as an extra param in the storage config info, to be used in S3 data encryption
- Added a finer grained authorization model for UpdateTable requests. Existing privileges continue to work for granting UpdateTable, such as `TABLE_WRITE_PROPERTIES`.
However, you can now instead grant privileges just for specific operations, such as `TABLE_ADD_SNAPSHOT`
- Added a Management API endpoint to reset principal credentials, controlled by the `ENABLE_CREDENTIAL_RESET` (default: true) feature flag.
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -161,6 +161,7 @@ private StorageConfigInfo getStorageInfo(Map<String, String> internalProperties)
.setRoleArn(awsConfig.getRoleARN())
.setExternalId(awsConfig.getExternalId())
.setUserArn(awsConfig.getUserARN())
.setKmsKeyArn(awsConfig.getKmsKeyArn())
.setStorageType(StorageConfigInfo.StorageTypeEnum.S3)
.setAllowedLocations(awsConfig.getAllowedLocations())
.setRegion(awsConfig.getRegion())
Expand Down Expand Up @@ -308,6 +309,7 @@ public Builder setStorageConfigurationInfo(
AwsStorageConfigurationInfo.builder()
.allowedLocations(allowedLocations)
.roleARN(awsConfigModel.getRoleArn())
.kmsKeyArn(awsConfigModel.getKmsKeyArn())
.externalId(awsConfigModel.getExternalId())
.region(awsConfigModel.getRegion())
.endpoint(awsConfigModel.getEndpoint())
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -90,7 +90,7 @@ public AccessConfig getSubscopedCreds(
.roleSessionName("PolarisAwsCredentialsStorageIntegration")
.policy(
policyString(
storageConfig.getAwsPartition(),
storageConfig,
allowListOperation,
allowedReadLocations,
allowedWriteLocations)
Expand Down Expand Up @@ -163,9 +163,8 @@ private boolean shouldUseSts(AwsStorageConfigurationInfo storageConfig) {
* ListBucket privileges with no resources. This prevents us from sending an empty policy to AWS
* and just assuming the role with full privileges.
*/
// TODO - add KMS key access
private IamPolicy policyString(
String awsPartition,
AwsStorageConfigurationInfo storageConfigurationInfo,
boolean allowList,
Set<String> readLocations,
Set<String> writeLocations) {
Expand All @@ -178,7 +177,10 @@ private IamPolicy policyString(
Map<String, IamStatement.Builder> bucketListStatementBuilder = new HashMap<>();
Map<String, IamStatement.Builder> bucketGetLocationStatementBuilder = new HashMap<>();

String arnPrefix = arnPrefixForPartition(awsPartition);
String arnPrefix = arnPrefixForPartition(storageConfigurationInfo.getAwsPartition());
String kmsKeyArn = storageConfigurationInfo.getKmsKeyArn();
Copy link
Contributor

@dimas-b dimas-b Oct 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can / will this work with MinIO KMS? Cf. https://github.com/minio/minio/blob/master/docs/kms/README.md

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will need to test that.

Copy link
Contributor

@dimas-b dimas-b Oct 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's fine either way, but if it did not work with MinIO it would be good to mention AWS (only) in the PR description.

Also, I'm pretty sure we'll have to support KMS in MinIO at some point, so if current changes were reusable it would be great (but not required).

addKmsKeyPolicy(kmsKeyArn, policyBuilder);

Stream.concat(readLocations.stream(), writeLocations.stream())
.distinct()
.forEach(
Expand Down Expand Up @@ -242,6 +244,21 @@ private IamPolicy policyString(
return policyBuilder.addStatement(allowGetObjectStatementBuilder.build()).build();
}

private static void addKmsKeyPolicy(String kmsKeyArn, IamPolicy.Builder policyBuilder) {
if (kmsKeyArn != null) {
IamStatement.Builder allowKms =
IamStatement.builder()
.effect(IamEffect.ALLOW)
.addAction("kms:GenerateDataKeyWithoutPlaintext")
.addAction("kms:Encrypt")
.addAction("kms:DescribeKey")
.addAction("kms:Decrypt")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we give encrypt or decrypt based on the priviledge ? for example TABLE_READ should get decrypt and TABLE_READ_AND_WRITE should both encrypt and decrypt ?

.addAction("kms:GenerateDataKey");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we add the kms encryption context to this statement?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are you suggesting to allow the user to specify a new param during catalog creation for the encryption context condition?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I mean for SSE, the encryption context is always the path to the file. In our IAM policy statement, we can specify that the key can be used only to en/decrypt files in the encrpytion context pattern we specify - e.g., see https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingKMSEncryption.html and https://docs.aws.amazon.com/kms/latest/developerguide/encrypt_context.html#encryption-context-authorization . The user doesn't need to specify this parameter - it's automatic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it is needed, the policy for the key and S3 bucket is restrictive enough and adding more restriction would cause the GenerateDataKey operation during the assume role to probably fail.

This is an example of a policy generated while creating a table:

IamPolicy(version=2012-10-17, statements=[

IamStatement(effect=IamEffect(value=Allow), actions=[IamAction(value=s3:PutObject), IamAction(value=s3:DeleteObject)], resources=[IamResource(value=arn:aws:s3:::app-id-bucket/fns/t3/*)]), 

IamStatement(effect=IamEffect(value=Allow), actions=[IamAction(value=kms:GenerateDataKeyWithoutPlaintext), IamAction(value=kms:DescribeKey), IamAction(value=kms:Decrypt), IamAction(value=kms:GenerateDataKey), IamAction(value=kms:Encrypt)], resources=[IamResource(value=arn:aws:kms:us-east-1:*********:key/id)]), 

IamStatement(effect=IamEffect(value=Allow), actions=[IamAction(value=s3:ListBucket)], resources=[IamResource(value=arn:aws:s3:::app-id-bucket)], conditions=[IamCondition(operator=StringLike, key=s3:prefix, value=fns/t3/*)]), 

IamStatement(effect=IamEffect(value=Allow), actions=[IamAction(value=s3:GetBucketLocation)], resources=[IamResource(value=arn:aws:s3:::app-id-bucket)]), 

IamStatement(effect=IamEffect(value=Allow), actions=[IamAction(value=s3:GetObject), IamAction(value=s3:GetObjectVersion)], resources=[IamResource(value=arn:aws:s3:::app-id-bucket/fns/t3/*)])])


Also in https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingKMSEncryption.html#encryption-context
it says that the bucket path is added by default in the encryption context.

Checking the encryption context in CloudTrail for the GenerateDataKey request I can see:

"encryptionContext": {
"aws:s3:arn": "arn:aws:s3:::app-id-*****"
}

allowKms.addResource(IamResource.create(kmsKeyArn));
policyBuilder.addStatement(allowKms.build());
}
}

private static String arnPrefixForPartition(String awsPartition) {
return String.format("arn:%s:s3:::", awsPartition != null ? awsPartition : "aws");
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -63,6 +63,10 @@ public String getFileIoImplClassName() {
@Nullable
public abstract String getRoleARN();

/** KMS Key ARN for server-side encryption, optional */
@Nullable
public abstract String getKmsKeyArn();

/** AWS external ID, optional */
@Nullable
public abstract String getExternalId();
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -70,7 +70,8 @@ public void testStsEndpoint() {

private static ImmutableAwsStorageConfigurationInfo.Builder newBuilder() {
return AwsStorageConfigurationInfo.builder()
.roleARN("arn:aws:iam::123456789012:role/polaris-test");
.roleARN("arn:aws:iam::123456789012:role/polaris-test")
.kmsKeyArn("arn:aws:kms:us-east-1:012345678901:key/444343245");
}

@Test
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -542,6 +542,74 @@ public void testGetSubscopedCredsInlinePolicyWithEmptyReadAndWrite() {
String.valueOf(EXPIRE_TIME.toEpochMilli()));
}

@Test
public void testGetSubscopedCredsInlinePolicyWithKmsKey() {
StsClient stsClient = Mockito.mock(StsClient.class);
String roleARN = "arn:aws:iam::012345678901:role/jdoe";
String externalId = "externalId";
String bucket = "bucket";
String warehouseKeyPrefix = "path/to/warehouse";
String firstPath = warehouseKeyPrefix + "/namespace/table";
String kmsKeyArn = "arn:aws:kms:us-east-1:012345678901:key/444343245";
Mockito.when(stsClient.assumeRole(Mockito.isA(AssumeRoleRequest.class)))
.thenAnswer(
invocation -> {
assertThat(invocation.getArguments()[0])
.isInstanceOf(AssumeRoleRequest.class)
.asInstanceOf(InstanceOfAssertFactories.type(AssumeRoleRequest.class))
.extracting(AssumeRoleRequest::policy)
.extracting(IamPolicy::fromJson)
.satisfies(
policy -> {
assertThat(policy)
.extracting(IamPolicy::statements)
.asInstanceOf(InstanceOfAssertFactories.list(IamStatement.class))
.hasSize(5)
.anySatisfy(
statement ->
assertThat(statement)
.returns(IamEffect.ALLOW, IamStatement::effect)
.returns(
List.of(
IamAction.create(
"kms:GenerateDataKeyWithoutPlaintext"),
IamAction.create("kms:Encrypt"),
IamAction.create("kms:DescribeKey"),
IamAction.create("kms:Decrypt"),
IamAction.create("kms:GenerateDataKey")),
IamStatement::actions)
.returns(
List.of(IamResource.create(kmsKeyArn)),
IamStatement::resources));
});
return ASSUME_ROLE_RESPONSE;
});
AccessConfig accessConfig =
new AwsCredentialsStorageIntegration(
AwsStorageConfigurationInfo.builder()
.addAllowedLocation(s3Path(bucket, warehouseKeyPrefix))
.roleARN(roleARN)
.externalId(externalId)
.region("us-east-1")
.kmsKeyArn(kmsKeyArn)
.build(),
stsClient)
.getSubscopedCreds(
EMPTY_REALM_CONFIG,
true,
Set.of(s3Path(bucket, firstPath)),
Set.of(s3Path(bucket, firstPath)),
Optional.empty());
assertThat(accessConfig.credentials())
.isNotEmpty()
.containsEntry(StorageAccessProperty.AWS_TOKEN.getPropertyName(), "sess")
.containsEntry(StorageAccessProperty.AWS_KEY_ID.getPropertyName(), "accessKey")
.containsEntry(StorageAccessProperty.AWS_SECRET_KEY.getPropertyName(), "secretKey")
.containsEntry(
StorageAccessProperty.AWS_SESSION_TOKEN_EXPIRES_AT_MS.getPropertyName(),
String.valueOf(EXPIRE_TIME.toEpochMilli()));
}

@ParameterizedTest
@ValueSource(strings = {AWS_PARTITION, "aws-cn", "aws-us-gov"})
public void testClientRegion(String awsPartition) {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -292,6 +292,7 @@ public void testCatalogTypeDefaultsToInternal() {
AwsStorageConfigInfo.builder()
.setRoleArn("arn:aws:iam::012345678901:role/test-role")
.setExternalId("externalId")
.setKmsKeyArn("arn:aws:kms:us-east-1:012345678901:key/444343245")
.setUserArn("aws::a:user:arn")
.setStorageType(StorageConfigInfo.StorageTypeEnum.S3)
.setAllowedLocations(List.of(baseLocation))
Expand All @@ -305,6 +306,8 @@ public void testCatalogTypeDefaultsToInternal() {

Catalog catalog = catalogEntity.asCatalog(serviceIdentityProvider);
assertThat(catalog.getType()).isEqualTo(Catalog.TypeEnum.INTERNAL);
assertThat(((AwsStorageConfigInfo) catalog.getStorageConfigInfo()).getKmsKeyArn())
.isEqualTo("arn:aws:kms:us-east-1:012345678901:key/444343245");
}

@Test
Expand All @@ -313,6 +316,7 @@ public void testCatalogTypeExternalPreserved() {
AwsStorageConfigInfo storageConfigModel =
AwsStorageConfigInfo.builder()
.setRoleArn("arn:aws:iam::012345678901:role/test-role")
.setKmsKeyArn("arn:aws:kms:us-east-1:012345678901:key/444343245")
.setExternalId("externalId")
.setUserArn("aws::a:user:arn")
.setStorageType(StorageConfigInfo.StorageTypeEnum.S3)
Expand All @@ -328,6 +332,8 @@ public void testCatalogTypeExternalPreserved() {

Catalog catalog = catalogEntity.asCatalog(serviceIdentityProvider);
assertThat(catalog.getType()).isEqualTo(Catalog.TypeEnum.EXTERNAL);
assertThat(((AwsStorageConfigInfo) catalog.getStorageConfigInfo()).getKmsKeyArn())
.isEqualTo("arn:aws:kms:us-east-1:012345678901:key/444343245");
}

@Test
Expand Down
4 changes: 4 additions & 0 deletions spec/polaris-management-service.yml
Original file line number Diff line number Diff line change
Expand Up @@ -1103,6 +1103,10 @@ components:
type: string
description: the aws user arn used to assume the aws role
example: "arn:aws:iam::123456789001:user/abc1-b-self1234"
kmsKeyArn:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By convention, adding REST API parameters needs to be discussed on the dev ML. Could you start a thread there and reference this PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is already a thread for this PR

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My mistake... I did not see it 🤦

For ref: https://lists.apache.org/thread/to7lbdw1k6dym5c6gnk1nm32vqdw24qk

type: string
description: the aws kms key arn used to encrypt s3 data
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

by s3 data we mean data stored in s3 or iceberg data files ?
if its just data stored in s3, a part of iceberg metadata is generated by polaris itself, do we need to plumb the encryption to the fileIO polaris uses ?

if its just iceberg data files how are we making sure its not used against iceberg metadata files such as manifest lists ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the key is used by s3 to do encryption of any data in the bucket at rest, so as long as the IAM role has got the right entitlements on the key everything will work fine. I don't think anything needs to be changed on the polaris side.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am talking of setting these properties in the FileIO which polaris uses - https://iceberg.apache.org/docs/nightly/aws/#s3-server-side-encryption or are you relying on these catalog properties seeping automatically to the FileIO

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I follow I might be missing something, how would you specify the kms key in Polaris without adding in the storage config, also we need to add the policy in there to be able to use right? thx

example: "arn:aws:kms::123456789001:key/01234578"
region:
type: string
description: the aws region where data is stored
Expand Down