Skip to content

Conversation

sharkipelago
Copy link
Contributor

@sharkipelago sharkipelago commented Aug 25, 2025

I rewrote the method using _mgr.apply but still struggling to figure out why 2 of test_rank.py test cases are failing. Any help or tips are appreciated! They are as follows:

self = <pandas.tests.frame.methods.test_rank.TestRank object at 0x77da84fae490>

    def test_rank2(self):
        df = DataFrame([[1, 3, 2], [1, 2, 3]])
        expected = DataFrame([[1.0, 3.0, 2.0], [1, 2, 3]]) / 3.0
        result = df.rank(1, pct=True)
        tm.assert_frame_equal(result, expected)

        df = DataFrame([[1, 3, 2], [1, 2, 3]])
        expected = df.rank(0) / 2.0
        result = df.rank(0, pct=True)
        tm.assert_frame_equal(result, expected)

        df = DataFrame([["b", "c", "a"], ["a", "c", "b"]])
        expected = DataFrame([[2.0, 3.0, 1.0], [1, 3, 2]])
        result = df.rank(1, numeric_only=False)
>       tm.assert_frame_equal(result, expected)

pandas/tests/frame/methods/test_rank.py:84:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
pandas/_libs/testing.pyx:53: in pandas._libs.testing.assert_almost_equal
    cpdef assert_almost_equal(a, b,
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

>   raise_assert_detail(
E   AssertionError: DataFrame.iloc[:, 1] (column name="1") are different
E
E   DataFrame.iloc[:, 1] (column name="1") values are different (100.0 %)
E   [index]: [0, 1]
E   [left]:  [1.5, 1.5]
E   [right]: [3.0, 3.0]
E   At positional index 0, first diff: 1.5 != 3.0

pandas/_libs/testing.pyx:171: AssertionError
self = <pandas.tests.frame.methods.test_rank.TestRank object at 0x77da84faf290>
float_string_frame =                A         B         C         D  foo                   datetime       timedelta
foo_0   0.189053 -0.522...5.169587 1 days 00:00:01
foo_29 -0.967681  1.678419  0.765355  0.045808  bar 2025-08-25 12:57:35.169587 1 days 00:00:01

    def test_rank_mixed_frame(self, float_string_frame):
        float_string_frame["datetime"] = datetime.now()
        float_string_frame["timedelta"] = timedelta(days=1, seconds=1)

        float_string_frame.rank(numeric_only=False)
>       with pytest.raises(TypeError, match="not supported between instances of"):
E       Failed: DID NOT RAISE <class 'TypeError'>

@@ -204,6 +204,7 @@ Other enhancements
- :meth:`.DataFrameGroupBy.transform`, :meth:`.SeriesGroupBy.transform`, :meth:`.DataFrameGroupBy.agg`, :meth:`.SeriesGroupBy.agg`, :meth:`.SeriesGroupBy.apply`, :meth:`.DataFrameGroupBy.apply` now support ``kurt`` (:issue:`40139`)
- :meth:`DataFrame.apply` supports using third-party execution engines like the Bodo.ai JIT compiler (:issue:`60668`)
- :meth:`DataFrame.iloc` and :meth:`Series.iloc` now support boolean masks in ``__getitem__`` for more consistent indexing behavior (:issue:`60994`)
- :meth:`DataFrame.rank` now uses internal ``_mgr.apply`` and preserves the dtype for extension arrays (:issue:`52829`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don’t need to mention mgr, just the user-facing bit

@jbrockmendel
Copy link
Member

In the axis=1 case you need to transpose the whole dataframe, not block-by-block

@@ -204,6 +204,7 @@ Other enhancements
- :meth:`.DataFrameGroupBy.transform`, :meth:`.SeriesGroupBy.transform`, :meth:`.DataFrameGroupBy.agg`, :meth:`.SeriesGroupBy.agg`, :meth:`.SeriesGroupBy.apply`, :meth:`.DataFrameGroupBy.apply` now support ``kurt`` (:issue:`40139`)
- :meth:`DataFrame.apply` supports using third-party execution engines like the Bodo.ai JIT compiler (:issue:`60668`)
- :meth:`DataFrame.iloc` and :meth:`Series.iloc` now support boolean masks in ``__getitem__`` for more consistent indexing behavior (:issue:`60994`)
- :meth:`DataFrame.rank` now preserves the dtype for extension arrays (:issue:`52829`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be dtype_backend, not dtype?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: DataFrame.rank does not return EA types when original type was an EADtype
2 participants