Run pandas test suite a second time with fallback disabled and xfail all failing tests #19693

@vyasr

With #18719 done, we now run the pandas test suite with a strict xfail setting so that 1) we do not regress our support, and 2) we immediately detect when seemingly unrelated improvements to cudf result in improved pandas compatibility. Our immediate next focus is #18659: passing the entire pandas test suite by removing all of those xfails. That goal should be achievable as long as we allow for fallback. As discussed in #17458, passing the entire pandas test suite with zero fallback is probably a non-goal for us, given the massive scope of the pandas API and the many use cases that are not worthwhile for cudf to accelerate. However, we do want to ensure that 1) we do not regress the performance of cudf.pandas by making changes that increase fallback, and 2) we are fully aware of, and can document, when we accelerate pandas and when we fall back to host execution.

To that end, we should add a second run of the pandas test suite that fails on fallback and xfails every test where fallback is currently known to occur. It can be added to our CI as a second job that runs alongside the current cudf.pandas test job.
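
As a rough illustration of the xfail half of this proposal, the sketch below shows one way a `conftest.py` could strictly xfail tests listed in a known-fallback file. It is not the cudf implementation: the file name `known_fallback_tests.txt`, its one-node-ID-per-line format, and the helper `parse_known_fallback` are all hypothetical, and how failure on fallback gets enabled for the run is left out entirely.

```python
# conftest.py (sketch): strictly xfail tests that are known to fall back.
from pathlib import Path

# Hypothetical file: one pytest node ID per line; lines starting with '#'
# are comments. Neither the name nor the format comes from cudf.
KNOWN_FALLBACK_FILE = Path("known_fallback_tests.txt")


def parse_known_fallback(text):
    """Return the set of node IDs in ``text``, skipping blanks and comments."""
    ids = set()
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            ids.add(line)
    return ids


def pytest_collection_modifyitems(config, items):
    """Mark every collected test found in the known-fallback list as a strict xfail."""
    # Imported here so the parsing helper above stays usable without pytest.
    import pytest

    if not KNOWN_FALLBACK_FILE.exists():
        return
    known = parse_known_fallback(KNOWN_FALLBACK_FILE.read_text())
    for item in items:
        if item.nodeid in known:
            # strict=True turns an unexpected pass into a failure, so the list
            # must shrink whenever a change removes a fallback.
            item.add_marker(
                pytest.mark.xfail(reason="known fallback to pandas", strict=True)
            )
```

With a strict marker, a test that starts falling back shows up as a new failure, and a test that stops falling back shows up as an unexpected pass; both force the list to be updated in the same change.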

Metadata

Assignees

Labels

Python (Affects Python cuDF API), improvement (Improvement / enhancement to an existing function), tests (Unit testing for project)

Type

No type

Projects

Status

In Progress

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests
