_metadata properties do not work with pyjanitor #1473

@raffaem

Brief Description

_metadata original properties are not passed on to the results of pyjanitor manipulations

System Information

  • Operating system: Windows
  • OS details (optional): 11
  • Python version (required): 3.13

Minimally Reproducible Code

import pandas as pd
import janitor # noqa: F401
import pandas_flavor as pf

# See: https://pandas.pydata.org/pandas-docs/stable/development/extending.html#define-original-properties
class MyDataFrame(pd.DataFrame):

    # normal properties
    _metadata = ["myvar"]

    @property
    def _constructor(self):
        return MyDataFrame

@pf.register_dataframe_method
def regvar(self):
    obj = MyDataFrame(self)
    obj.myvar = 2
    return obj

@pf.register_dataframe_method
def printvar(self):
    print(self.myvar)
    return self

df = pd.DataFrame(
    {
        "Year": [1999, 2000, 2004, 1999, 2004],
        "Taxon": [
            "Saccharina",
            "Saccharina",
            "Saccharina",
            "Agarum",
            "Agarum",
        ],
        "Abundance": [4, 5, 2, 1, 8],
    }
)

df2 = df.regvar().query("Taxon=='Saccharina'").printvar()

index = pd.Index(range(1999, 2005), name="Year")
df2 = df.regvar().complete(index, "Taxon", sort=True).printvar()

Error Messages

The first call, which chains the built-in pandas method query, correctly prints 2.

The second call, which chains the pyjanitor method complete, raises:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_4412\627945022.py in ?()
     39 
     40 df2 = df.regvar().query("Taxon=='Saccharina'").printvar()
     41 
     42 index = pd.Index(range(1999,2005),name='Year')
---> 43 df2 = df.regvar().complete(index, "Taxon", sort=True).printvar()

c:\Users\raffaele\venvs\base\Lib\site-packages\pandas_flavor\register.py in ?(self, *args, **kwargs)
    160                     object: The result of calling of the method.
    161                 """
    162                 global method_call_ctx_factory
    163                 if method_call_ctx_factory is None:
--> 164                     return method(self._obj, *args, **kwargs)
    165 
    166                 return handle_pandas_extension_call(
    167                     method, method_signature, self._obj, args, kwargs

~\AppData\Local\Temp\ipykernel_4412\627945022.py in ?(self)
     21 @pf.register_dataframe_method
     22 def printvar(self):
---> 23     print(self.myvar)
     24     return self

c:\Users\raffaele\venvs\base\Lib\site-packages\pandas\core\generic.py in ?(self, name)
   6295             and name not in self._accessors
   6296             and self._info_axis._can_hold_identifiers_and_holds_name(name)
   6297         ):
   6298             return self[name]
-> 6299         return object.__getattribute__(self, name)

AttributeError: 'DataFrame' object has no attribute 'myvar'
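
For anyone hitting the same thing, here is a minimal pandas-only sketch of what the traceback suggests is happening, plus a possible workaround. It assumes (not confirmed against the pyjanitor source) that the method builds its result as a plain pd.DataFrame rather than going through _constructor and __finalize__. The helpers rebuild_as_plain and restore_metadata are hypothetical names made up for this illustration; they are not part of pandas, pyjanitor, or pandas_flavor.

```python
import pandas as pd


class MyDataFrame(pd.DataFrame):
    # normal properties
    _metadata = ["myvar"]

    @property
    def _constructor(self):
        return MyDataFrame


def rebuild_as_plain(df):
    # Hypothetical stand-in for a method that assembles its result from
    # scratch as a plain pd.DataFrame, so the subclass and myvar are dropped.
    return pd.DataFrame(df.to_dict("list"))


def restore_metadata(result, source):
    # Hypothetical workaround: re-wrap the result in the source's class,
    # then let __finalize__ copy the names listed in _metadata from source.
    return type(source)(result).__finalize__(source)


src = MyDataFrame({"Year": [1999, 2000], "Abundance": [4, 5]})
src.myvar = 2

lost = rebuild_as_plain(src)         # plain DataFrame, no myvar attribute
fixed = restore_metadata(lost, src)  # MyDataFrame again, myvar copied over

print(type(lost).__name__, hasattr(lost, "myvar"))
print(type(fixed).__name__, fixed.myvar)
```

In the reproducer above, this would amount to keeping a reference to the df.regvar() result and passing the output of complete through restore_metadata before calling printvar. That only papers over the problem on the caller's side; whether pyjanitor should propagate _metadata itself is the actual question of this issue.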
