-
-
Notifications
You must be signed in to change notification settings - Fork 146
Add defaults for parameters #1293
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
start: TimedeltaConvertibleTypes = ..., | ||
end: TimedeltaConvertibleTypes = ..., | ||
periods: int | None = ..., | ||
start: TimedeltaConvertibleTypes | None = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Docstring says the default is None so I had to change the annotation
Based on the docstring, this should probably be rewritten to have overloads:
Of the four parameters
start
,end
,periods
, andfreq
,
exactly three must be specified.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is true for date_range()
as well
Just for my own knowledge, what is adding the defaults instead of the ellipsis trying to achieve? Is it to help when looking at the docs in the IDE, for pyrefly, or something else? |
@@ -1133,7 +1141,7 @@ class Series(IndexOpsMixin[S1], NDFrame): | |||
self, | |||
periods: int | Sequence[int] = ..., | |||
freq: DateOffset | timedelta | _str | None = ..., | |||
axis: Axis = ..., | |||
axis: Axis | None = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this annotation changed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
docs aren't right here for Series.align
It's to help with looking at docs/signature in the IDE. This work is unrelated to Pyrefly. Here's a thread on typeshed w/ some discussion on the pros and cons of defaults: python/typeshed#8988 The consensus seems to be that if the default is simple (like None, or some literal) then it's fine, but if it's complex we use The second reason I made the PR is just to test out this way of doing LLM-assisted codemods. I have another PR in the works that uses the same approach to add a few hundred types for un-annotated parameters and returns, but that one requires more manual filtering to make sure the generated types are good. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I checked every method you changed by looking in the docs. It picks up things that we need to delete or modify anyway. Some general comments:
- I use the documented signatures to determine default values, not just what the docs say, because the docs don't always indicate the correct default value, but the signature does because it is picked up in the docs from the actual python declaration.
- The
copy
argument often has a default value ofNone
, so I had to change the declaration a lot of times in your PR, and probably missed a few.
It might be worth modifying your script to look at the default values that are in the signatures rather than the docs. If you do that, you'll pick up a lot more defaults (and probably catch a bunch that I caught via this review).
This PR points out that the pandas docs could be improved, but creating an issue for each improvement would be a lot of work.
When you update, please don't force push, and don't resolve conversations, because then I can see if everything I found was addressed, and can just concentrate on what is changed in your next commit(s).
Thanks for doing this.
@@ -197,12 +197,12 @@ class Period(PeriodMixin): | |||
def day_of_year(self) -> int: ... | |||
@property | |||
def day_of_week(self) -> int: ... | |||
def asfreq(self, freq: str | BaseOffset, how: _PeriodFreqHow = ...) -> Period: ... | |||
def asfreq(self, freq: str | BaseOffset, how: _PeriodFreqHow = "end") -> Period: ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure whether this matters. Code has default of how
as "E"
, but docs say "end"
I'm fine with leaving it as you have it, but open to which way to go.
@classmethod | ||
def now(cls, freq: str | BaseOffset = ...) -> Period: ... | ||
def strftime(self, fmt: str) -> str: ... | ||
def to_timestamp( | ||
self, | ||
freq: str | BaseOffset | None = ..., | ||
how: _PeriodToTimestampHow = ..., | ||
how: _PeriodToTimestampHow = "S", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same issue as in asfreq()
. docs say "S"
, but code says "start"
I'm fine with leaving it as you have it, but open to which way to go.
@@ -157,7 +157,7 @@ class Timestamp(datetime, SupportsIndex): | |||
) -> Timestamp: ... | |||
def astimezone(self, tz: _tzinfo | None = ...) -> Self: ... | |||
def ctime(self) -> str: ... | |||
def isoformat(self, sep: str = ..., timespec: str = ...) -> str: ... | |||
def isoformat(self, sep: str = "T", timespec: str = "auto") -> str: ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def isoformat(self, sep: str = "T", timespec: str = "auto") -> str: ... | |
def isoformat(self, sep: str = "T", timespec: Literal[‘auto’, ‘hours’, ‘minutes’, ‘seconds’, ‘milliseconds’, ‘microseconds’, ‘nanoseconds’]= "auto") -> str: ... |
pandas-stubs/core/series.pyi
Outdated
periods: int = ..., | ||
fill_method: None = ..., | ||
periods: int = 1, | ||
fill_method: None = None, | ||
freq: DateOffset | timedelta | _str | None = ..., |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
freq: DateOffset | timedelta | _str | None = ..., | |
freq: DateOffset | timedelta | _str | None = None, |
pandas-stubs/core/series.pyi
Outdated
periods: int = ..., | ||
fill_method: None = ..., | ||
periods: int = 1, | ||
fill_method: None = None, | ||
freq: DateOffset | timedelta | _str | None = ..., | ||
*, | ||
fill_value: Scalar | NAType | None = ..., |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove fill_value
param - not documented
pandas-stubs/core/series.pyi
Outdated
span: float | None = ..., | ||
halflife: float | None = ..., | ||
alpha: float | None = ..., |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
span: float | None = ..., | |
halflife: float | None = ..., | |
alpha: float | None = ..., | |
com: float | None = None, | |
span: float | None = None, | |
halflife: float | None = None, | |
alpha: float | None = None, |
github wouldn't let me do suggestion on com
@@ -1950,67 +1958,67 @@ class Series(IndexOpsMixin[S1], NDFrame): | |||
self, | |||
other: num | _ListLike | Series[S1], | |||
level: Level | None = ..., | |||
fill_value: float | None = ..., | |||
fill_value: float | None = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be changed in mul
too
pandas-stubs/core/series.pyi
Outdated
def pow( | ||
self, | ||
other: num | _ListLike | Series[S1], | ||
level: Level | None = ..., | ||
fill_value: float | None = ..., | ||
fill_value: float | None = None, | ||
axis: AxisIndex | None = ..., | ||
) -> Series[S1]: ... | ||
def prod( | ||
self, | ||
axis: AxisIndex | None = ..., |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For all of these operations, the axis
value defaults to 0. There are lots of them in here - I'm commenting on this one, but you should fix the others. I think it is the case that if axis
is an argument to a Series
method, the default value is 0, because that's the only value that makes sense.
I looked further into the |
Thanks for the detailed review! That's a good point about taking the default values directly from the function declaration. I've pushed a commit doing that using https://gist.github.com/yangdanny97/170f82ee5389584f8b6292bc4ea9c24d I'll go thru the comments one by one and address them over the next few days, and keep track of which followup issues we need to file. |
Closes #1292
This PR uses a semi-automated/LLM-assisted method to add a few hundred parameter defaults to pandas-stubs, replacing what was previously
...
.There were 3 annotations that had to change because their type signatures are not compatible with the default.
The codemod that made these transformations is here: https://gist.github.com/yangdanny97/0d81e16f6374582413dd5bcf90d902fd
How it works is:
...
The main thing that needs careful reviewing is the
default_type_map
which maps the docstring snippet to the type annotation we replace the...
with. That part was generated with chatGPT's help (I gave it every snippet I got after step 2), and I commented out the mappings that seemed wrong or unclear. So, even though this was "LLM-assisted" the script is deterministic and will always give you the same thing each time, and the LLM is only involved in a one-time manual step.