TYP: Type default values in stubs in numpy/ma #29531

Open: wants to merge 9 commits into main

Member

@MarcoGorelli MarcoGorelli commented Aug 7, 2025

Making some progress towards #28428

Similar PR in pandas: pandas-dev/pandas-stubs#1293

I've done this based on work started in https://gist.github.com/yangdanny97/170f82ee5389584f8b6292bc4ea9c24d. We're looking at open-sourcing a reusable tool that does this automatically where possible:

  • if a stub file uses = ...
  • and the corresponding definition has a simple default
  • then fill the ... in

I checked the ones in this PR manually, and they look correct to me.
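
For illustration, the three rules above can be sketched with the standard-library ast module. This is not the gist's implementation (the review mentions a libcst codemod, which preserves comments and formatting, whereas ast.unparse does not). The helper names collect_simple_defaults and fill_stub_defaults are hypothetical, "simple" is assumed here to mean a bool, int, float, str or None literal, and functions are matched by bare name only.

```python
import ast

# Assumption: a "simple" default is one of these literal types.
SIMPLE_TYPES = (bool, int, float, str, type(None))


def _is_ellipsis(node):
    return isinstance(node, ast.Constant) and node.value is Ellipsis


def _is_simple(node):
    return isinstance(node, ast.Constant) and isinstance(node.value, SIMPLE_TYPES)


def collect_simple_defaults(py_source):
    """Map function name -> {parameter name: default node} for simple literals."""
    found = {}
    for node in ast.walk(ast.parse(py_source)):
        if not isinstance(node, ast.FunctionDef):
            continue
        args = node.args
        positional = args.posonlyargs + args.args
        # Defaults belong to the last len(defaults) positional parameters.
        tail = positional[len(positional) - len(args.defaults):]
        pairs = list(zip(tail, args.defaults))
        # kw_defaults holds None for keyword-only parameters without a default.
        pairs += [(p, d) for p, d in zip(args.kwonlyargs, args.kw_defaults) if d is not None]
        found[node.name] = {p.arg: d for p, d in pairs if _is_simple(d)}
    return found


def fill_stub_defaults(stub_source, py_source):
    """Replace `= ...` in the stub wherever the runtime default is simple."""
    known_by_func = collect_simple_defaults(py_source)
    tree = ast.parse(stub_source)
    for node in ast.walk(tree):
        if not isinstance(node, ast.FunctionDef) or node.name not in known_by_func:
            continue
        known = known_by_func[node.name]
        args = node.args
        positional = args.posonlyargs + args.args
        offset = len(positional) - len(args.defaults)
        for i, default in enumerate(args.defaults):
            name = positional[offset + i].arg
            if _is_ellipsis(default) and name in known:
                args.defaults[i] = known[name]
        for i, (param, default) in enumerate(zip(args.kwonlyargs, args.kw_defaults)):
            if default is not None and _is_ellipsis(default) and param.arg in known:
                args.kw_defaults[i] = known[param.arg]
    return ast.unparse(tree)


runtime = (
    "def fix_invalid(a, mask=nomask, copy=True, fill_value=None):\n    return a\n"
    "def argsort(a, axis=np._NoValue, *, stable=None):\n    return a\n"
)
stub = (
    "def fix_invalid(a, mask=..., copy=..., fill_value=...): ...\n"
    "def argsort(a, axis=..., *, stable=...): ...\n"
)
filled = fill_stub_defaults(stub, runtime)
print(filled)
```

Non-literal defaults such as nomask or np._NoValue are left as `...`, and negative numbers (a unary minus node in the AST) are not treated as simple in this sketch.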

@jorenham jorenham added the component: numpy.ma masked arrays label Aug 8, 2025
@jorenham jorenham self-requested a review August 11, 2025 18:50
Member

@jorenham jorenham left a comment

I like the libcst approach, and the ones I checked seem to all be correct.

Technically speaking, this has little to do with static typing, but can be very helpful for IDE introspection, which is also one of the main advantages of annotations, so I suppose it's fine to keep the TYP: label.

For most defaults I can see how they can be useful. But in some cases, like out=None, I'm not sure the defaults are actually helpful: without annotations, out=None alone doesn't tell you anything more about how the parameter can be used. It could even be a bit confusing if some parameters use =None and others =np._NoValue (i.e. =...), because that appears inconsistent, especially if you consider that the documentation of _NoValue often incorrectly says it defaults to None.

Anyway, it probably doesn't matter much, so I'm fine with keeping those =None. Removing them now would mean we'd have to add them back in again once we add the annotations.

get_data = getdata

def fix_invalid(a, mask=..., copy=..., fill_value=...): ...
def fix_invalid(a, mask=..., copy=True, fill_value=None): ...
Member

The reasoning behind PYI014 doesn't make much sense to me, and their definition of a "simple" value seems pretty arbitrary.
So I wouldn't mind ignoring it and leaving it up to our own judgement whether we use ... or e.g. a np.False_ like in this case:

Suggested change
def fix_invalid(a, mask=..., copy=True, fill_value=None): ...
def fix_invalid(a, mask=np.False_, copy=True, fill_value=None): ... # noqa: PYI014

but that's just what I think, and I'll leave that decision to you

Member Author

thanks for your review

I'm OK with using non-simple defaults, but it should be kept in sync, right? because in the .py file there's nomask, not np.False_. I get that nomask is an alias for np.False_, but a static analysis tool doesn't 😄 So, this would be a very manual effort to do it across the codebase, would that be ok? (tbh I think the diff is quite big already, shall we leave that to a separate discussion?)

Member

> I get that nomask is an alias for np.False_, but a static analysis tool doesn't 😄

Well, according to ruff's PYI014 docs:

> Stub (.pyi) files exist to define type hints, and are not evaluated at runtime. As such, function arguments in stub files should not have default values, as they are ignored by type checkers.

But that's not true, because def f(_: int = ""): ... would be reported as an error in a .pyi. And the same error will be reported when you replace the literal "" with a constant:

from typing import Final, Literal

C: Final[Literal[""]] = ""

def f(_: int = C) -> None: ...

here, pyright reports

Expression of type "Literal['']" cannot be assigned to parameter of type "int"
  "Literal['']" is not assignable to "int"

and mypy says

Incompatible default for argument "_" (default has type "Literal['']", argument has type "int")

That means that in case of nomask, if we were to annotate it as e.g. nomask: Final[np.bool[Literal[False]]] = ..., mypy and pyright will treat mask=nomask in exactly the same way as mask=np.False_.

But it's indeed true that tools like pylance will not show mask=nomask when it's actually mask=np.False_, or vice-versa. So it would probably be better to use nomask as default here instead of np.False_.

But then again, I'm also fine with listening to ruff here. We can always reconsider once we actually annotate these functions.

Member Author

Sorry, I meant that a libcst/ast-based tool has no knowledge that nomask corresponds to np.False_ (unless we start following imports, but currently the tool works file by file, like ruff checks usually do).

> So it would probably be better to use nomask as default here instead of np.False_.

nice, this is what I was hoping for 🙌
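
To make the file-per-file limitation concrete, here is a small sketch using the standard-library ast module (the actual tool is described as libcst-based, so this is only an analogy). Parsing the runtime signature on its own shows that mask=nomask is just a name reference, while True and None are literal constants whose values can be copied straight into the stub.

```python
import ast

# The runtime signature, parsed in isolation (no imports are followed).
source = "def fix_invalid(a, mask=nomask, copy=True, fill_value=None): ...\n"
func = ast.parse(source).body[0]

params = func.args.args
defaults = func.args.defaults
# Defaults align with the trailing parameters.
with_defaults = params[len(params) - len(defaults):]

node_kinds = {p.arg: type(d).__name__ for p, d in zip(with_defaults, defaults)}
print(node_kinds)
```

Resolving the Name node for nomask to np.False_ would require following the assignment or import that defines it, which a single-file pass does not do. Copying the name nomask itself into the stub needs no such resolution.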

def make_mask(m, copy=..., shrink=..., dtype=...): ...
def make_mask_none(newshape, dtype=...): ...
def mask_or(m1, m2, copy=..., shrink=...): ...
def make_mask(m, copy=False, shrink=True, dtype=...): ...
Member

Suggested change
def make_mask(m, copy=False, shrink=True, dtype=...): ...
def make_mask(m, copy=False, shrink=True, dtype=np.bool): ...

🤷🏻

def masked_outside(x, v1, v2, copy=True): ...
def masked_object(x, value, copy=True, shrink=True): ...
def masked_values(x, value, rtol=1e-5, atol=1e-8, copy=True, shrink=True): ...
def masked_invalid(a, copy=True): ...

class _MaskedPrintOption:
Member

The shrink parameter of the enable method defaults to 1.

Member Author

ooh, thanks! Looks like the script wasn't picking up methods defined in classes: yangdanny97/docs2types#5

def power(a, b, third=...): ...
def argsort(a, axis=..., kind=..., order=..., endwith=..., fill_value=..., *, stable=...): ...
def power(a, b, third=None): ...
def argsort(a, axis=..., kind=None, order=None, endwith=True, fill_value=None, *, stable=...): ...
Member

Doesn't the libcst codemod support keyword-only parameters?

Suggested change
def argsort(a, axis=..., kind=None, order=None, endwith=True, fill_value=None, *, stable=...): ...
def argsort(a, axis=..., kind=None, order=None, endwith=True, fill_value=None, *, stable=None): ...

Member Author

yup, fixed, thanks! yangdanny97/docs2types#6

def transpose(a, axes=...): ...
def reshape(a, new_shape, order=...): ...
def transpose(a, axes=None): ...
def reshape(a, new_shape, order='C'): ...
Member

I realize that there are some existing ' quotes here and there, but " is used way more often. I'm kinda surprised that ruff accepts this though 🤔

Suggested change
def reshape(a, new_shape, order='C'): ...
def reshape(a, new_shape, order="C"): ...

Member Author

I don't really have a preference, but I think if it's a project preference then it should be automated - I've opened #29548 for this
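
As a rough idea of what automating the check could look like, here is a sketch using the standard-library tokenize module. It is not what #29548 proposes (that issue's contents aren't shown here), and the helper name single_quoted_strings is hypothetical. It reports string literals that are delimited with single quotes.

```python
import io
import tokenize


def single_quoted_strings(source):
    """Return (line number, literal) for string tokens delimited by single quotes."""
    hits = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type != tokenize.STRING:
            continue
        # Drop any string prefix (r, b, u, f) before looking at the quote character.
        body = tok.string.lstrip("rbufRBUF")
        if body.startswith("'"):
            hits.append((tok.start[0], tok.string))
    return hits


stub = (
    "def reshape(a, new_shape, order='C'): ...\n"
    'def choose(indices, choices, out=None, mode="raise"): ...\n'
)
hits = single_quoted_strings(stub)
print(hits)
```

In practice a linter rule would be the more maintainable way to enforce this; the sketch only shows that the check is mechanical.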

def where(condition, x=..., y=...): ...
def choose(indices, choices, out=..., mode=...): ...
def round_(a, decimals=..., out=...): ...
def choose(indices, choices, out=None, mode='raise'): ...
Member

Suggested change
def choose(indices, choices, out=None, mode='raise'): ...
def choose(indices, choices, out=None, mode="raise"): ...

Comment on lines +2358 to +2359
def correlate(a, v, mode='valid', propagate_mask=True): ...
def convolve(a, v, mode='full', propagate_mask=True): ...
Member

Suggested change
def correlate(a, v, mode='valid', propagate_mask=True): ...
def convolve(a, v, mode='full', propagate_mask=True): ...
def correlate(a, v, mode="valid", propagate_mask=True): ...
def convolve(a, v, mode="full", propagate_mask=True): ...

@@ -55,7 +55,7 @@ __all__ = [
"vstack",
]

def count_masked(arr, axis=...): ...
def count_masked(arr, axis=None): ...
def masked_all(shape, dtype=...): ...
Member

Ruff PYI014 wouldn't accept this I think, which is pretty arbitrary if you ask me.

Suggested change
def masked_all(shape, dtype=...): ...
def masked_all(shape, dtype=float): ... # noqa: PYI014

Comment on lines +88 to +89
commentchar='#',
missingchar='',
Member

Suggested change
commentchar='#',
missingchar='',
commentchar="#",
missingchar="",

2 participants