-
Notifications
You must be signed in to change notification settings - Fork 5.5k
fix: Misleading error message caused by unsupported compresssion codec #26262
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Reviewer's GuideThis PR enhances error handling for unsupported compression codecs in Iceberg table creation by validating codec compatibility before accessing Optional values, replacing cryptic NoSuchElementExceptions with descriptive PrestoExceptions, and supplements the change with comprehensive unit tests. Sequence diagram for improved error handling during Iceberg table creationsequenceDiagram
participant "User"
participant "IcebergUtil.populateTableProperties()"
participant "HiveCompressionCodec"
participant "PrestoException"
User->>"IcebergUtil.populateTableProperties()": Create Iceberg table with compression codec
"IcebergUtil.populateTableProperties()"->>"HiveCompressionCodec": Check if codec is supported for file format
alt Codec supported
"IcebergUtil.populateTableProperties()"->>"HiveCompressionCodec": Get compression codec value
"IcebergUtil.populateTableProperties()"-->>User: Table created successfully
else Codec not supported
"IcebergUtil.populateTableProperties()"->>"PrestoException": Throw descriptive error
"IcebergUtil.populateTableProperties()"-->>User: Receive descriptive error message
end
File-Level Changes
Possibly linked issues
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey there - I've reviewed your changes and they look great!
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location> `presto-iceberg/src/test/java/com/facebook/presto/iceberg/IcebergDistributedSmokeTestBase.java:1109-1107` </location>
<code_context>
assertEquals(512, getTargetSplitSize(0, 512).toBytes());
}
+
+ @Test
+ public void testCompressionCodecSupportForParquet()
+ {
+ assertThat(HiveCompressionCodec.NONE.isSupportedStorageFormat(HiveStorageFormat.PARQUET)).isTrue();
+ assertThat(HiveCompressionCodec.SNAPPY.isSupportedStorageFormat(HiveStorageFormat.PARQUET)).isTrue();
+ assertThat(HiveCompressionCodec.GZIP.isSupportedStorageFormat(HiveStorageFormat.PARQUET)).isTrue();
+
+ assertThat(HiveCompressionCodec.LZ4.isSupportedStorageFormat(HiveStorageFormat.PARQUET)).isFalse();
+ assertThat(HiveCompressionCodec.ZSTD.isSupportedStorageFormat(HiveStorageFormat.PARQUET)).isFalse();
+ }
+
+ @Test
</code_context>
<issue_to_address>
**suggestion (testing):** Missing test for case-insensitive codec values.
Add tests to confirm that codec values are correctly recognized regardless of case sensitivity.
Suggested implementation:
```java
@Test
public void testCompressionCodecSupportForParquet()
{
assertThat(HiveCompressionCodec.NONE.isSupportedStorageFormat(HiveStorageFormat.PARQUET)).isTrue();
assertThat(HiveCompressionCodec.SNAPPY.isSupportedStorageFormat(HiveStorageFormat.PARQUET)).isTrue();
assertThat(HiveCompressionCodec.GZIP.isSupportedStorageFormat(HiveStorageFormat.PARQUET)).isTrue();
assertThat(HiveCompressionCodec.LZ4.isSupportedStorageFormat(HiveStorageFormat.PARQUET)).isFalse();
assertThat(HiveCompressionCodec.ZSTD.isSupportedStorageFormat(HiveStorageFormat.PARQUET)).isFalse();
}
@Test
public void testCompressionCodecSupportForParquetCaseInsensitive()
{
// Case-insensitive checks for supported codecs
assertThat(HiveCompressionCodec.valueOf("none".toUpperCase()).isSupportedStorageFormat(HiveStorageFormat.PARQUET)).isTrue();
assertThat(HiveCompressionCodec.valueOf("snappy".toUpperCase()).isSupportedStorageFormat(HiveStorageFormat.PARQUET)).isTrue();
assertThat(HiveCompressionCodec.valueOf("gzip".toUpperCase()).isSupportedStorageFormat(HiveStorageFormat.PARQUET)).isTrue();
assertThat(HiveCompressionCodec.valueOf("NONE")).isEqualTo(HiveCompressionCodec.NONE);
assertThat(HiveCompressionCodec.valueOf("SnApPy".toUpperCase())).isEqualTo(HiveCompressionCodec.SNAPPY);
assertThat(HiveCompressionCodec.valueOf("GZIP")).isEqualTo(HiveCompressionCodec.GZIP);
// Case-insensitive checks for unsupported codecs
assertThat(HiveCompressionCodec.valueOf("lz4".toUpperCase()).isSupportedStorageFormat(HiveStorageFormat.PARQUET)).isFalse();
assertThat(HiveCompressionCodec.valueOf("zstd".toUpperCase()).isSupportedStorageFormat(HiveStorageFormat.PARQUET)).isFalse;
}
```
If `HiveCompressionCodec.valueOf()` does not handle case-insensitivity, you may need to implement a helper method or use a custom lookup to convert codec names to enum values in a case-insensitive way. Adjust the test accordingly if this is the case.
</issue_to_address>
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
assertEquals(512, getTargetSplitSize(0, 512).toBytes()); | ||
} | ||
|
||
@Test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider using a data provider for these to improve readability
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@aaneja Thanks.
if (compressionCodec.isSupportedStorageFormat(HiveStorageFormat.PARQUET)) { | ||
propertiesBuilder.put(PARQUET_COMPRESSION, compressionCodec.getParquetCompressionCodec().get().toString()); | ||
} | ||
else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Specifying else
explicitly here is redundant since we break
right after. Consider inverting the if
to make it a check-and-fail-if-not-supported flow
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks.
} | ||
|
||
@Test | ||
public void testUnsupportedCompressionCodec() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider using a data provider for these to improve readability
Session sessionWithLz4Orc = Session.builder(getSession()) | ||
.setCatalogSessionProperty("iceberg", "compression_codec", "LZ4") | ||
.build(); | ||
assertQueryFails(sessionWithLz4Orc, | ||
"CREATE TABLE test_unsupported_compression (x int) WITH (format = 'ORC')", | ||
"Compression codec LZ4 is not supported for ORC format"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I see, based on our current code logic, this statement will succeed. It seems a little weird but the compression codec will be set to NONE
for the target Iceberg table.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, never mind. I just found you have invoke compressionCodec.isSupportedStorageFormat(HiveStorageFormat.ORC)
to check and throw the exception.
e866e90
to
4fb0995
Compare
@aaneja @hantangwangd Could you please have another look. Thanks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this fix, lgtm!
Description
Problem:
When setting
iceberg.compression-codec=ZSTD
(or LZ4) in iceberg.propertiesand attempt to create a iceberg table with default file format, the system throws a cryptic error:
"NoSuchElementException: No value present" at IcebergUtil.populateTableProperties().
13:55:39 {"level":"error","benchmark_stage_id":"cleanup_sf100","query_file":"create-table/customer.sql","query_index":0,"cold_run":true,"sequence_no":0,"info_url":"https://xxxxxx.prestodb.dev/ui/query.html?20251009_125539_00038_24pui","query_id":"20251009_125539_00038_24pui","query_error":{"message":"No value present","error_code":65536,"error_name":"GENERIC_INTERNAL_ERROR","error_type":"INTERNAL_ERROR","retriable":false,"failure_info":{"type":"java.util.NoSuchElementException","message":"No value present","suppressed":[],"stack":["java.base/java.util.Optional.get(Optional.java:143)","com.facebook.presto.iceberg.IcebergUtil.populateTableProperties(IcebergUtil.java:1178)","com.facebook.presto.iceberg.IcebergHiveMetadata.beginCreateTable(IcebergHiveMetadata.java:385)","com.facebook.presto.iceberg.IcebergAbstractMetadata.createTable(IcebergAbstractMetadata.java:503)","com.facebook.presto.spi.connector.classloader.ClassLoaderSafeConnectorMetadata.createTable(ClassLoaderSafeConnectorMetadata.java:374)","com.facebook.presto.metadata.MetadataManager.createTable(MetadataManager.java:675)","com.facebook.presto.execution.CreateTableTask.internalExecute(CreateTableTask.java:229)","com.facebook.presto.execution.CreateTableTask.execute(CreateTableTask.java:96)","com.facebook.presto.execution.CreateTableTask.execute(CreateTableTask.java:78)","com.facebook.presto.execution.DDLDefinitionExecution.executeTask(DDLDefinitionExecution.java:67)","com.facebook.presto.execution.DataDefinitionExecution.start(DataDefinitionExecution.java:217)","com.facebook.presto.$gen.Presto_0_296_0d532be__0_296_202510071020_0d532be3____20251009_124343_1.run(Unknown Source)","com.facebook.presto.execution.SqlQueryManager.createQuery(SqlQueryManager.java:326)","com.facebook.presto.dispatcher.LocalDispatchQuery.lambda$startExecution$8(LocalDispatchQuery.java:210)","java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)","..."]}},"expected_row_count":1,"start_time":"2025-10-09T12:55:39Z","finish_time":"2025-10-09T12:55:39Z","duration_in_seconds":0.112960251,"time":"2025-10-09T12:55:39Z","message":"execution failed"}
The root case of this error message is
ZSTD
is a unsupported compression codec, and when executing this linepropertiesBuilder.put(PARQUET_COMPRESSION, getCompressionCodec(session).getParquetCompressionCodec().get().toString());
,For ZSTD and LZ4, getParquetCompressionCodec() returns Optional.empty() and calling .get() on empty Optional throws the confusing exception
Solution:
Added validation in IcebergUtil.populateTableProperties() to check codec
compatibility before calling .get() on Optional.
And also add unit tests to validate the compression codec.
Motivation and Context
Impact
Test Plan
Contributor checklist
Release Notes
Please follow release notes guidelines and fill in the release notes below.
If release note is NOT required, use: