ICEBERG_TOO_MANY_OPEN_PARTITIONS awswrangler.athena.to_iceberg #3222

@TRANTANKHOA

Description

Describe the bug

Inserting a DataFrame that spans 50 partitions with awswrangler.athena.to_iceberg fails with ICEBERG_TOO_MANY_OPEN_PARTITIONS.

How to Reproduce

Use the file debug_chunk_df.csv in this script

import awswrangler
import pandas

pandas_df = pandas.read_csv('/tmp/debug_chunk_df.csv')
awswrangler.athena.to_iceberg(df=pandas_df, database=database, table=table, mode="append", temp_path=temp_path)

to insert into this Athena table

CREATE TABLE iceberg_db.account_uploads (
  accountnumber string,
  firstname string,
  lastname string,
  address string,
  city string,
  state string,
  zipcode bigint,
  balance double,
  originalcreditor string,
  dateofbirth date,
  ssn string)
PARTITIONED BY (`state`, day(`dateofbirth`))
LOCATION 's3://dev-sftp-stack/iceberg/account_uploads'
TBLPROPERTIES (
  'table_type'='iceberg',
  'write_compression'='snappy',
  'format'='parquet',
  'optimize_rewrite_delete_file_threshold'='10'
);

This results in

{QueryFailed}QueryFailed("ICEBERG_TOO_MANY_OPEN_PARTITIONS: Exceeded limit of 100 open writers for partitions. If a data manifest file was ...ou may need to manually clean the data from locations specified in the manifest. Athena will not delete data in your account.")

even though the input contains only 50 distinct combinations of (state, day(dateofbirth)).
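The count of distinct partition values can be checked independently of Athena. The following is a minimal sketch, not part of the original report: the helper name `distinct_partitions` is hypothetical, and it assumes that `dateofbirth` is stored as an ISO `YYYY-MM-DD` string and that `day(dateofbirth)` on a `date` column yields one partition value per calendar date.

```python
from datetime import date


def distinct_partitions(rows):
    """Return the set of (state, day) pairs found in an iterable of row dicts.

    Hypothetical helper for illustration only. Each row must provide the
    keys 'state' and 'dateofbirth'.
    """
    partitions = set()
    for row in rows:
        # Assumption: dateofbirth is an ISO 'YYYY-MM-DD' string.
        day = date.fromisoformat(row["dateofbirth"])
        partitions.add((row["state"], day))
    return partitions


# Made-up sample rows, not taken from debug_chunk_df.csv.
sample_rows = [
    {"state": "CA", "dateofbirth": "1980-01-01"},
    {"state": "CA", "dateofbirth": "1980-01-01"},  # same partition as above
    {"state": "NY", "dateofbirth": "1980-01-01"},
]
sample_count = len(distinct_partitions(sample_rows))
```

To run the same check on the real file, the rows can come from `csv.DictReader` or from `pandas_df.to_dict("records")`. If the count is at or below 50, the input itself does not exceed the limit of 100 named in the error message.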

Expected behavior

The data should be uploaded without error.

Your project

No response

OS

MacOS Sequoia 15.6

Python version

3.11.13

AWS SDK for pandas version

3.13.0

Additional context

Could the date data type of the dateofbirth column be causing the issue?
