ENH: Include line number and number of fields when read_csv() callable with engine="python" raises ParserWarning #61974

Open
wants to merge 3 commits into
base: main

Conversation

sanggon6107
Contributor

Description of the change

When on_bad_lines is a callable and engine="pyarrow", read_csv() currently passes the callable a description of the invalid row (expected_columns, actual_columns, number, text) if the row has too many elements. With engine="python", however, the callable receives only the contents of the row.

(For more details on pyarrow.csv.InvalidRow, see pyarrow documentation)

This PR proposes additionally passing expected_columns, actual_columns and row to the callable when on_bad_lines is a callable and engine="python", so that users can describe the invalid row in more detail.

The order of the arguments has been aligned with pyarrow.
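
For context, here is a minimal sketch of the current behaviour with engine="python", before this change. The sample data and the handler name `handle_bad_line` are made up for illustration. The callable receives only the split fields of the offending row, so it cannot report the line number or the expected field count without extra bookkeeping.

```python
import io

import pandas as pd

# Hypothetical sample: the header has 2 fields, the second data row has 3.
data = "a,b\n1,2\n3,4,5\n6,7\n"

seen = []  # collects whatever the callable is given


def handle_bad_line(bad_line):
    # With engine="python", the callable currently gets only the fields
    # of the bad row as a list of strings, with no line number attached.
    seen.append(bad_line)
    return None  # returning None tells the parser to skip this row


df = pd.read_csv(
    io.StringIO(data),
    engine="python",
    on_bad_lines=handle_bad_line,
)

print(seen)
print(df)
```

In this sketch the bad row is skipped and only its field list reaches the handler. Under this PR, the handler would also be given the expected and actual column counts and the row number.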

Development

Successfully merging this pull request may close these issues.

ENH: Include line number and number of fields when read_csv() callable raises ParserWarning
1 participant