Skip to content

Improvements to arguments, types with stubtest #1294

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

loicdiridollou
Copy link
Member

More improvements with the stubtest flagging some drift with pandas.

One point that was raised is the handling of deprecated items, maybe the other possibility than purely removing it from the stubs is to force the stubs to adopt the default value so that whatever the user is doing it won't allow any other behavior.
We need to see how much we can use stubtest, I don't see it being used in CI at the moment or anytime soon considering the mountain of work that it raises (mostly correctly but I have seen a few places where it is flagging things that are fine), there is also the problem of the no_default that pandas uses abundantly and which is hard to replicate in the stubs.

  • Closes #xxxx (Replace xxxx with the Github issue number)
  • Tests added: Please use assert_type() to assert the type of any return value

@@ -1628,24 +1643,7 @@ class DataFrame(NDFrame, OpsMixin, _GetItemHack):
def isin(self, values: Iterable | Series | DataFrame | dict) -> Self: ...
@property
def plot(self) -> PlotAccessor: ...
def hist(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But hist_frame is in a private module, and we should remove that module, so revert this change.

@@ -308,6 +309,21 @@ else:
@overload
def __getitem__(self, key: Hashable) -> Series: ...

AstypeArgExt: TypeAlias = (
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved them out of the DataFrame class

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should make them private, i.e., renameAstypeArgExt to _AstypeArgExt

@@ -76,10 +76,9 @@ class MultiIndex(Index):
@property
def codes(self): ...
def set_codes(self, codes, *, level=..., verify_integrity: bool = ...): ...
def copy( # pyright: ignore[reportIncompatibleMethodOverride] # pyrefly: ignore
self, names=..., deep: bool = ...
def copy( # type: ignore[override] # pyright: ignore[reportIncompatibleMethodOverride] # pyrefly: ignore
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need overwriting of the parent class (as no names exist).

@@ -136,7 +135,7 @@ class MultiIndex(Index):
self, indices, axis: int = ..., allow_fill: bool = ..., fill_value=..., **kwargs
): ...
def append(self, other): ...
def argsort(self, *args, **kwargs): ...
def argsort(self, *args, na_position: NaPosition = ..., **kwargs): ...
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not documented but at runtime it exists, not sure why it is not documented though.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you create an issue in pandas about this?

@@ -53,7 +52,6 @@ class PeriodIndex(DatetimeIndexOpsMixin[pd.Period], PeriodIndexFieldOps):
def __rsub__( # pyright: ignore[reportIncompatibleMethodOverride]
self, other: NaTType
) -> NaTType: ...
def __array__(self, dtype=...) -> np.ndarray: ...
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using parent definition.

@@ -435,3 +435,23 @@ class PlotAccessor:
) -> npt.NDArray[np.object_]: ...

density = kde

def hist_frame(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Defining it here for pd.DataFrame.hist

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since plotting._core is not documented, we should remove it, and then revert this change.

Copy link
Collaborator

@Dr-Irv Dr-Irv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks. a number of things that stubgen picks up we still need to follow the docs on

@@ -308,6 +309,21 @@ else:
@overload
def __getitem__(self, key: Hashable) -> Series: ...

AstypeArgExt: TypeAlias = (
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should make them private, i.e., renameAstypeArgExt to _AstypeArgExt

"datetime64[ns]",
]
)
AstypeArgExtList: TypeAlias = AstypeArgExt | list[AstypeArgExt]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
AstypeArgExtList: TypeAlias = AstypeArgExt | list[AstypeArgExt]
_AstypeArgExtList: TypeAlias = _AstypeArgExt | list[_AstypeArgExt]

make private

Comment on lines 1330 to 1344
def stack(
self, level: IndexLabel = ..., dropna: _bool = ..., sort: _bool = ...
self,
level: IndexLabel = ...,
dropna: _bool = ...,
sort: _bool = ...,
future_stack: Literal[False] = ...,
) -> Self | Series: ...
@overload
def stack(
self, level: IndexLabel = ..., future_stack: _bool = ...
self,
level: IndexLabel = ...,
dropna: _NoDefaultDoNotUse = ...,
sort: _NoDefaultDoNotUse = ...,
future_stack: Literal[True] = ...,
) -> Self | Series: ...
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The overload with future_stack: Literal[True] should come first, and have future_stack: Literal[True], and not have the dropna and sort arguments in the list of args.

self,
func: AggFuncTypeBase | AggFuncTypeDictSeries,
func: AggFuncTypeBase | AggFuncTypeDictSeries = ...,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not valid. If you don't specify the value of func, an exception will be raised. Please revert.

def count(
self, axis: Axis = ..., level: None = ..., numeric_only: _bool = ...
) -> Series: ...
def count(self, axis: Axis = ..., numeric_only: _bool = ...) -> Self: ...
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be Series[int]

@@ -136,7 +135,7 @@ class MultiIndex(Index):
self, indices, axis: int = ..., allow_fill: bool = ..., fill_value=..., **kwargs
): ...
def append(self, other): ...
def argsort(self, *args, **kwargs): ...
def argsort(self, *args, na_position: NaPosition = ..., **kwargs): ...
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you create an issue in pandas about this?

@@ -938,7 +937,7 @@ class Series(IndexOpsMixin[S1], NDFrame):
self, i: Level = ..., j: Level = ..., copy: _bool = ...
) -> Series[S1]: ...
def reorder_levels(self, order: list) -> Series[S1]: ...
def explode(self) -> Series[S1]: ...
def explode(self, ignore_index: bool = ...) -> Series[S1]: ...
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be _bool. I think all the args in Series methods that have bool should be changed to _bool

@@ -1683,7 +1680,6 @@ class Series(IndexOpsMixin[S1], NDFrame):
) -> TimedeltaSeries: ...
@overload
def __rmul__(self, other: num | _ListLike | Series) -> Series: ...
def __rnatmul__(self, other: num | _ListLike | Series[S1]) -> Series[S1]: ...
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be __rmatmul__ - that was a typo, so put it back with the correct name

) -> ExponentialMovingWindow[Series]: ...
@final
def expanding(
self,
min_periods: int = ...,
axis: Axes = ...,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
axis: Axes = ...,
axis: Literal[0] = ...,

Comment on lines +1526 to 1527
s1.ewm(com=0.3, min_periods=0, adjust=False, ignore_na=True),
"ExponentialMovingWindow[pd.Series]",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if this is valid usage, then move it outside of the if TYPE_CHECKING_INVALID_USAGE test

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants