Skip to content

[Format] Specify FIXED_SIZE_LIST Logical type #430

@asfimport

Description

@asfimport

Replicated from mailing list

Arrow recently introduced FixedShapeTensor and VariableShapeTensor canonical extension types that use FixedSizeList and StructArray(List, FixedSizeList) as storage respectfully. These are targeted at machine learning and scientific applications that deal with large datasets and would benefit from using Parquet as on disk storage.

However currently FixedSizeList is stored as List in Parquet which adds significant conversion overhead when reading and writing as discussed here. It would therefore be beneficial to introduce a FIXED_SIZE_LIST logical type to Parquet.

Reporter: Rok Mihevc

PRs and other links:

Note: This issue was originally created as PARQUET-2474. Please see the migration documentation for further details.

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions