Provide some ability to bail out of nested ArrayBuilders #9038

@kszlim

Description

Is your feature request related to a problem or challenge? Please describe what you are trying to do.
When building a nested array with something like a StructBuilder, you may be halfway through a row/record and then find that you can't continue (perhaps you're parsing a streaming format into arrow). There is currently no easy way to mark either the whole row or the remaining fields of the row as null. You have to keep traversing your entire schema, pushing dummy/null data into your child builders (recursively), and then at the end either append null or finish your row. This is very unergonomic, since every part of your parser needs special handling for this case.
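
To make the pain point concrete, here is a minimal sketch of the padding workaround. It uses toy stand-ins rather than the arrow-rs API: `Int32Col`, `RowBuilder`, and `parse_row` are hypothetical names, and `finish` only mimics the "every child must have one slot per row" length check.

```rust
/// Hypothetical stand-in for a primitive child builder (not an arrow-rs type).
#[derive(Default)]
struct Int32Col {
    values: Vec<Option<i32>>,
}

impl Int32Col {
    fn append_value(&mut self, v: i32) {
        self.values.push(Some(v));
    }
    fn append_null(&mut self) {
        self.values.push(None);
    }
    fn len(&self) -> usize {
        self.values.len()
    }
}

/// Hypothetical stand-in for a struct builder with three children.
#[derive(Default)]
struct RowBuilder {
    children: [Int32Col; 3],
    validity: Vec<bool>,
}

impl RowBuilder {
    /// Closes the current row, recording whether it is valid.
    fn append(&mut self, is_valid: bool) {
        self.validity.push(is_valid);
    }

    /// Mimics length validation: every child needs one slot per closed row.
    fn finish(&self) -> Result<usize, String> {
        let rows = self.validity.len();
        for (i, child) in self.children.iter().enumerate() {
            if child.len() != rows {
                return Err(format!(
                    "child {} has {} slots, expected {}",
                    i,
                    child.len(),
                    rows
                ));
            }
        }
        Ok(rows)
    }
}

/// Parses one row; `fields[i] == None` means the input broke off at field `i`.
fn parse_row(builder: &mut RowBuilder, fields: [Option<i32>; 3]) {
    for (i, field) in fields.iter().enumerate() {
        match field {
            Some(v) => builder.children[i].append_value(*v),
            None => {
                // Bail-out path: pad this child and every later one by hand,
                // then close the row as null.
                for child in &mut builder.children[i..] {
                    child.append_null();
                }
                builder.append(false);
                return;
            }
        }
    }
    builder.append(true);
}
```

With only three flat children the padding loop is short, but with nested children the same padding has to be repeated at every level of the schema, which is the special handling described above.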

Describe the solution you'd like
This was discussed on Discord; the idea is something like a truncate_row API. There are a few questions to answer: should it null out all the remaining fields in your row, and how does it interact with nested data (what if you're partway through some nested data, so your child builders are in a halfway state)?

Is there a clean way to just ignore the state of your child builders and just dump that row?
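
One possible shape for this, sketched on a toy model rather than on arrow-rs itself: the parent knows how many rows it has closed, so a truncate could roll every child back to that count, whatever halfway state the child is in. `Col`, `RowBuilder`, and this `truncate_row` are all hypothetical, and this version drops the partial row instead of nulling it out.

```rust
/// Hypothetical column builder: a flat i32 column or a list-of-i32 column.
enum Col {
    Int(Vec<Option<i32>>),
    /// `offsets` starts as `[0]`; a list slot is closed by pushing `values.len()`.
    List { offsets: Vec<usize>, values: Vec<i32> },
}

impl Col {
    /// Number of closed slots in this column.
    fn len(&self) -> usize {
        match self {
            Col::Int(v) => v.len(),
            Col::List { offsets, .. } => offsets.len() - 1,
        }
    }

    /// Rolls the column back to `len` closed slots, discarding any partial state.
    fn truncate(&mut self, len: usize) {
        match self {
            Col::Int(v) => v.truncate(len),
            Col::List { offsets, values } => {
                offsets.truncate(len + 1);
                let end = *offsets.last().expect("offsets always holds a leading 0");
                // Drops values that were pushed for a list slot that never closed.
                values.truncate(end);
            }
        }
    }
}

/// Hypothetical struct builder: one validity bit per closed row.
struct RowBuilder {
    children: Vec<Col>,
    validity: Vec<bool>,
}

impl RowBuilder {
    /// Hypothetical `truncate_row`: discard everything written since the last closed row.
    fn truncate_row(&mut self) {
        let rows = self.validity.len();
        for child in &mut self.children {
            child.truncate(rows);
        }
    }
}
```

In this model the parser can return early from anywhere in the row and call `truncate_row` once at the top level. Whether a real implementation should then also append a null row is one of the open questions above.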

Describe alternatives you've considered
Perhaps a cleaner option is providing an option to turn off length validation, finish your builder, and then slice off the last partial row?
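
A rough sketch of that alternative, again on plain vectors rather than arrow-rs arrays: finish without checking lengths, then slice each column down to the number of rows that were actually closed. The function name and the `closed_rows` parameter are assumptions made for illustration.

```rust
/// Slices every (possibly ragged) column down to `closed_rows` entries.
/// Returns `None` if some column is shorter than `closed_rows`, which would
/// mean the data is invalid rather than merely carrying a partial last row.
fn slice_to_closed_rows(
    columns: &[Vec<Option<i32>>],
    closed_rows: usize,
) -> Option<Vec<&[Option<i32>]>> {
    columns.iter().map(|c| c.get(..closed_rows)).collect()
}
```

This keeps the parser free of padding logic, at the cost of the builder temporarily holding columns of unequal length.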

Labels: enhancement (any new improvement worthy of an entry in the changelog)
