ENH: Include line number and number of fields when read_csv() callable with engine="python" raises ParserWarning
#61974
+21
−9
doc/source/whatsnew/vX.X.X.rst
file if fixing a bug or adding a new feature.
Description of the change
With engine="pyarrow", read_csv() currently provides a description of an invalid row (expected_columns, actual_columns, number, text) when the row has too many elements. With engine="python", however, the on_bad_lines callable receives only the contents of the row. (For more details on pyarrow.csv.InvalidRow, see the pyarrow documentation.)
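For context, here is a minimal sketch of the current engine="python" behavior described above. The sample data and the handler name handle_bad_line are made up for illustration. The handler only sees the split fields of the offending row, so it cannot tell which line they came from or how many fields were expected.

```python
import io

import pandas as pd

# Illustrative data: the second data row has three fields, the header only two.
csv_text = "a,b\n1,2\n3,4,5\n6,7\n"

seen = []  # collects every bad row the parser hands to the callable


def handle_bad_line(bad_line):
    # With engine="python" the callable receives only the row contents
    # as a list of strings; no line number or expected field count.
    seen.append(bad_line)
    return None  # returning None tells the parser to skip the row


df = pd.read_csv(
    io.StringIO(csv_text),
    engine="python",
    on_bad_lines=handle_bad_line,
)
```

After the call, seen holds the fields of the over-long row and df contains only the two well-formed rows.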
This PR proposes to additionally pass expected_columns, actual_columns and row when on_bad_lines is a callable and engine="python", so that users can describe the invalid row in more detail. The order of the arguments has been aligned with pyarrow.
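As a rough illustration of what such a callable could look like, the sketch below is plain Python and does not call pandas. The exact signature is not spelled out in this description, so the four positional parameters, their order (mirroring the pyarrow field order with the row contents last), and the name describe_bad_line are all assumptions.

```python
def describe_bad_line(expected_columns, actual_columns, row, bad_line):
    """Hypothetical handler; parameter names and order are assumptions."""
    # Build a message that identifies the offending row in detail.
    message = (
        f"row {row}: expected {expected_columns} fields, "
        f"got {actual_columns}: {bad_line}"
    )
    print(message)
    # Keep only as many fields as the header expects.
    return bad_line[:expected_columns]


# Simulated call, standing in for what the parser might pass.
fixed = describe_bad_line(2, 3, 3, ["3", "4", "5"])
```

Truncating the row is only one possible policy; a handler could equally return None to drop the row once it has logged the details.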