Open
Labels
bug (Something isn't working)
Description
Describe the bug
Inserting a DataFrame with 50 partitions using awswrangler.athena.to_iceberg fails with ICEBERG_TOO_MANY_OPEN_PARTITIONS.
How to Reproduce
Use the attached file debug_chunk_df.csv in the following script
import awswrangler
import pandas

pandas_df = pandas.read_csv('/tmp/debug_chunk_df.csv')
awswrangler.athena.to_iceberg(df=pandas_df, database=database, table=table, mode="append", temp_path=temp_path)
to insert into this Athena table:
CREATE TABLE iceberg_db.account_uploads (
accountnumber string,
firstname string,
lastname string,
address string,
city string,
state string,
zipcode bigint,
balance double,
originalcreditor string,
dateofbirth date,
ssn string)
PARTITIONED BY (`state`, day(`dateofbirth`))
LOCATION 's3://dev-sftp-stack/iceberg/account_uploads'
TBLPROPERTIES (
'table_type'='iceberg',
'write_compression'='snappy',
'format'='parquet',
'optimize_rewrite_delete_file_threshold'='10'
);
will result in
{QueryFailed}QueryFailed("ICEBERG_TOO_MANY_OPEN_PARTITIONS: Exceeded limit of 100 open writers for partitions. If a data manifest file was ...ou may need to manually clean the data from locations specified in the manifest. Athena will not delete data in your account.")
even though the input contains only 50 distinct combinations of (state, day(dateofbirth)).
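For what it's worth, here is a quick way to double-check that count from the attached CSV (a sketch only, assuming dateofbirth holds plain calendar dates):

import pandas

check_df = pandas.read_csv('/tmp/debug_chunk_df.csv')
dob_day = pandas.to_datetime(check_df['dateofbirth']).dt.date
# Count distinct (state, day(dateofbirth)) partition values in the input.
print(check_df.groupby(['state', dob_day]).ngroups)  # reported above as 50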
Expected behavior
The data should be inserted without error.
Your project
No response
Screenshots
OS
macOS Sequoia 15.6
Python version
3.11.13
AWS SDK for pandas version
3.13.0
Additional context
Could the dateofbirth date data type be causing the issue?
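In case it is useful to others hitting this, one possible workaround (a sketch only, assuming the error comes from Athena opening a separate writer for every partition touched by a single INSERT) is to append each (state, day(dateofbirth)) group in its own to_iceberg call:

import awswrangler
import pandas

pandas_df = pandas.read_csv('/tmp/debug_chunk_df.csv')
dob_day = pandas.to_datetime(pandas_df['dateofbirth']).dt.date
# Append one partition group at a time so each INSERT touches only a single partition.
for _, group_df in pandas_df.groupby(['state', dob_day]):
    awswrangler.athena.to_iceberg(df=group_df, database=database, table=table, mode="append", temp_path=temp_path)

This trades one large INSERT for roughly 50 small ones, so it is slower, but each query stays well under the open-writer limit.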