BUG: DataFrame.rank does not return EA types when original type was an EADtype #62189

sharkipelago · 2025-08-25T17:01:21Z

closes BUG: DataFrame.rank does not return EA types when original type was an EADtype #52829
Tests added and passed if fixing a bug or adding a new feature
All code checks passed.
Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

I rewrote the method using _mgr.apply, but I am still struggling to figure out why two of the test cases in test_rank.py are failing. Any help or tips are appreciated! The failures are as follows:

self = <pandas.tests.frame.methods.test_rank.TestRank object at 0x77da84fae490>

    def test_rank2(self):
        df = DataFrame([[1, 3, 2], [1, 2, 3]])
        expected = DataFrame([[1.0, 3.0, 2.0], [1, 2, 3]]) / 3.0
        result = df.rank(1, pct=True)
        tm.assert_frame_equal(result, expected)

        df = DataFrame([[1, 3, 2], [1, 2, 3]])
        expected = df.rank(0) / 2.0
        result = df.rank(0, pct=True)
        tm.assert_frame_equal(result, expected)

        df = DataFrame([["b", "c", "a"], ["a", "c", "b"]])
        expected = DataFrame([[2.0, 3.0, 1.0], [1, 3, 2]])
        result = df.rank(1, numeric_only=False)
>       tm.assert_frame_equal(result, expected)

pandas/tests/frame/methods/test_rank.py:84:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
pandas/_libs/testing.pyx:53: in pandas._libs.testing.assert_almost_equal
    cpdef assert_almost_equal(a, b,
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

>   raise_assert_detail(
E   AssertionError: DataFrame.iloc[:, 1] (column name="1") are different
E
E   DataFrame.iloc[:, 1] (column name="1") values are different (100.0 %)
E   [index]: [0, 1]
E   [left]:  [1.5, 1.5]
E   [right]: [3.0, 3.0]
E   At positional index 0, first diff: 1.5 != 3.0

pandas/_libs/testing.pyx:171: AssertionError

self = <pandas.tests.frame.methods.test_rank.TestRank object at 0x77da84faf290>
float_string_frame =                A         B         C         D  foo                   datetime       timedelta
foo_0   0.189053 -0.522...5.169587 1 days 00:00:01
foo_29 -0.967681  1.678419  0.765355  0.045808  bar 2025-08-25 12:57:35.169587 1 days 00:00:01

    def test_rank_mixed_frame(self, float_string_frame):
        float_string_frame["datetime"] = datetime.now()
        float_string_frame["timedelta"] = timedelta(days=1, seconds=1)

        float_string_frame.rank(numeric_only=False)
>       with pytest.raises(TypeError, match="not supported between instances of"):
E       Failed: DID NOT RAISE <class 'TypeError'>

jbrockmendel · 2025-08-25T17:59:27Z

doc/source/whatsnew/v3.0.0.rst

@@ -204,6 +204,7 @@ Other enhancements
 - :meth:`.DataFrameGroupBy.transform`, :meth:`.SeriesGroupBy.transform`, :meth:`.DataFrameGroupBy.agg`, :meth:`.SeriesGroupBy.agg`, :meth:`.SeriesGroupBy.apply`, :meth:`.DataFrameGroupBy.apply` now support ``kurt`` (:issue:`40139`)
 - :meth:`DataFrame.apply` supports using third-party execution engines like the Bodo.ai JIT compiler (:issue:`60668`)
 - :meth:`DataFrame.iloc` and :meth:`Series.iloc` now support boolean masks in ``__getitem__`` for more consistent indexing behavior (:issue:`60994`)
+- :meth:`DataFrame.rank` now uses internal ``_mgr.apply`` and preserves the dtype for extension arrays (:issue:`52829`)


Don’t need to mention mgr, just the user-facing bit

jbrockmendel · 2025-08-25T18:15:16Z

In the axis=1 case you need to transpose the whole dataframe, not block-by-block
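
A minimal sketch of that point, using only the public API rather than the PR's internal _mgr.apply code: ranking with axis=1 should match transposing the whole frame, ranking down the columns, and transposing back. The variable names are illustrative, and the data is the one from the failing test_rank2 case (object dtype is set explicitly here, which is an assumption made so the sketch does not depend on string-dtype inference).

```python
import pandas as pd

# Same values as the failing case in test_rank2; dtype=object is an
# assumption of this sketch, not something the test itself specifies.
df = pd.DataFrame([["b", "c", "a"], ["a", "c", "b"]], dtype=object)

# Rank across each row: transpose the whole frame, rank down the
# columns, then transpose back.
via_transpose = df.T.rank(axis=0, numeric_only=False).T

# The public axis=1 call that the transposed version should reproduce.
direct = df.rank(axis=1, numeric_only=False)

# Ranking each column on its own is a different operation: column 1
# holds two tied "c" values, so both rows get the average rank 1.5.
column_wise = df.rank(axis=0, numeric_only=False)

print(via_transpose)
print(direct)
print(column_wise)
```

The 1.5 values in column_wise are the same ones reported as [left] in the traceback above, which suggests the failing code path was ranking along the wrong axis.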

jbrockmendel · 2025-08-26T17:32:19Z

doc/source/whatsnew/v3.0.0.rst

@@ -204,6 +204,7 @@ Other enhancements
 - :meth:`.DataFrameGroupBy.transform`, :meth:`.SeriesGroupBy.transform`, :meth:`.DataFrameGroupBy.agg`, :meth:`.SeriesGroupBy.agg`, :meth:`.SeriesGroupBy.apply`, :meth:`.DataFrameGroupBy.apply` now support ``kurt`` (:issue:`40139`)
 - :meth:`DataFrame.apply` supports using third-party execution engines like the Bodo.ai JIT compiler (:issue:`60668`)
 - :meth:`DataFrame.iloc` and :meth:`Series.iloc` now support boolean masks in ``__getitem__`` for more consistent indexing behavior (:issue:`60994`)
+- :meth:`DataFrame.rank` now preserves the dtype for extension arrays (:issue:`52829`)


should be dtype_backend, not dtype?
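
For context, a small sketch of what the distinction looks like from the user side. Ranks are floats, so an integer extension column cannot literally keep its dtype; presumably what the note should promise is that the result stays in the same backend family (nullable or pyarrow) instead of falling back to NumPy float64. No expected dtypes are shown because they depend on the pandas version and on whether this change is included.

```python
import pandas as pd

# A single nullable-integer column; "Int64" is the masked extension dtype.
df = pd.DataFrame({"a": pd.array([2, 1, 3], dtype="Int64")})

# Rank the column as a Series and the whole frame as a DataFrame.
series_rank = df["a"].rank()
frame_rank = df.rank()

# Compare the result dtypes of the two code paths on your pandas version.
print(series_rank.dtype)
print(frame_rank.dtypes)
```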

sharkipelago added 3 commits August 25, 2025 12:31

using _mgr apply with 2 failing tests

98a70df

transposed blocks to keep axis_int parameter intact

af9f3de

added rst

00ef2ea

jbrockmendel reviewed Aug 25, 2025

sharkipelago added 6 commits August 25, 2025 16:53

updated rst

f78ffa5

Merge remote-tracking branch 'upstream/main' into frame-rank-extension-array

5124513

dataframe level transpose

178f4e3

Merge remote-tracking branch 'upstream/main' into frame-rank-extension-array

3a94c7f

removed redundant ndim checks

94893f0

added pytest skips if no pyarrow module

7ee79ef

jbrockmendel reviewed Aug 26, 2025

sharkipelago added 2 commits August 26, 2025 13:59

Merge remote-tracking branch 'upstream/main' into frame-rank-extension-array

601dc39

corrected to dtype_backend

2e7ca27
