TYP: Stop using Any as shape-type default #27211

Merged
merged 1 commit into from
Sep 2, 2024

Conversation

jorenham
Member

@jorenham jorenham commented Aug 14, 2024

This replaces the default shape-type of numpy.ndarray and its subtypes from Any to tuple[int, ...] (and tuple[int, int] for numpy.matrix).

This might seem trivial, since the shape-type parameter is already bound to tuple[int, ...].
However, ndarray[Any, _] is very different from ndarray[tuple[int, ...], _]:

from typing import Any
import numpy as np

arr_xx: np.ndarray[Any, np.dtype[np.bool]]
arr_nd: np.ndarray[tuple[int, ...], np.dtype[np.bool]]

reveal_type(arr_xx.shape)  # Type of "arr_xx.shape" is "Any"
reveal_type(arr_nd.shape)  # Type of "arr_nd.shape" is "tuple[int, ...]"

This is clearly not what we want: ndarray.shape should always be a subtype of tuple[int, ...]. Instead it is Any, which isn't a subtype of tuple[int, ...] but the opposite, i.e. a supertype. That should never be allowed to happen, because the shape TypeVar is covariant and bound.

And yet, it did...

... and here's why:

def only_pass_a_shape(shape: tuple[int, ...]) -> None: ...

not_a_shape: Any
_ = only_pass_a_shape(not_a_shape)  # this is allowed!
So Any behaves as the supertype of everything, but at the same time it is also the subtype of everything (like Never).

This violates the (very) fundamental axiom that if T is both a supertype and a subtype of S, i.e. T <: S <: T, then T == S (analogous to a == b iff a <= b <= a for integers, which follows from the Peano axioms).
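
To make the consequence concrete, here is a minimal runnable sketch using only the standard library. The string value and the `ndim` variable are illustrative assumptions, and the function returns an int here (rather than None as above) purely so that the effect is visible at runtime. The point is that a type checker accepts the call even though the argument is not a shape at all.

```python
from typing import Any


def only_pass_a_shape(shape: tuple[int, ...]) -> int:
    """Return the number of dimensions described by a shape tuple."""
    return len(shape)


# Illustrative value: clearly not a shape, but annotated as Any.
not_a_shape: Any = "abc"

# Type checkers accept this call, because Any is assignable to tuple[int, ...].
# At runtime it silently computes len("abc") instead of failing.
ndim = only_pass_a_shape(not_a_shape)
print(ndim)
```

Had `not_a_shape` been annotated as `str`, both mypy and pyright would be expected to reject the call; the Any annotation is what hides the mistake.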

Currently, the numpy.typing.NDArray type-alias uses Any as the shape-type of ndarray, but there are many more examples of this.


This PR tries to avoid such unexpected Any behavior for shape-type defaults by replacing Any with tuple[int, ...], so that the default matches the upper type bound (as opposed to maximally exceeding it).

It turns out that this also fixes a couple of typing bugs in several functions, including np.copy, np.concatenate, and np.(v|h|)stack. See the changed tests for details.


And this isn't as backwards-incompatible as you might initially think. Those who only use npt.NDArray in their codebase (which I'm guessing is the majority of numpy-typing users) won't even notice it.
And for those who use e.g. a "manual" _: np.ndarray[Any, _], nothing changes either, because after all, Any can be used everywhere, and it matches everything.
Only the few "power-users" who are already using shape-typing in their codebase might notice a difference (I might actually be the only one; but who knows 🤷🏻). But even so, I'm guessing it won't be a big deal for these power-users to change some annotations (i.e. remove some workarounds that aren't needed anymore).


Edit

I forgot to mention that this is the case in both mypy (1.11.1) and pyright (1.1.375).
But the typing docs are a bit vague about this, so it might not be universal type-checker behavior.


Edit 2

Probably the most important issue with Any as shape-type is that it makes it impossible to write overloads for the shape. This is because ndarray[Any, DT] will match an overload for e.g. tuple[int, ...], but also one for tuple[int]. And there's no (universal) way to "filter out" the Any using an overload.
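
The overload problem can be sketched without NumPy. `Array` below is a hypothetical stand-in for ndarray that is generic over its shape type only, and `ndim_of` is an illustrative function, not a NumPy API. How a given type checker resolves the Any case is not asserted here.

```python
from typing import Any, Generic, Literal, TypeVar, overload

# Same constraints as described for the ndarray shape parameter: bound and covariant.
ShapeT = TypeVar("ShapeT", bound=tuple[int, ...], covariant=True)


class Array(Generic[ShapeT]):
    """Hypothetical stand-in for ndarray, generic over its shape type only."""

    def __init__(self, shape: ShapeT) -> None:
        self._shape = shape

    @property
    def shape(self) -> ShapeT:
        return self._shape


@overload
def ndim_of(a: Array[tuple[int]]) -> Literal[1]: ...
@overload
def ndim_of(a: Array[tuple[int, ...]]) -> int: ...
def ndim_of(a: Array[Any]) -> int:
    # The runtime behavior is identical for all inputs; only the static
    # return type differs between the two overloads.
    return len(a.shape)


one_d = Array((3,))        # inferred as Array[tuple[int]]
n_d = Array((2, 3))        # inferred as Array[tuple[int, int]]

# An Array[Any] argument is compatible with both overloads, so the
# 1-d overload cannot be reserved for arrays known to be 1-d.
untyped: Array[Any] = Array((4, 5, 6))

print(ndim_of(one_d), ndim_of(n_d), ndim_of(untyped))
```

With tuple[int, ...] as the shape-type instead of Any, only the second overload applies to an array of unknown dimensionality, which is the behavior shape-specific overloads rely on.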

In other words, shape-typing is currently practically impossible (and attempting to do so causes very confusing type-checker behavior).

Currently, we can't even properly type something as trivial as numpy.shape(ndarray[tuple[int], dtype[bool]]), and the best we can do with mypy is tuple[int, ...]. See #27210 for the implementation.

@charris
Member

charris commented Aug 30, 2024

Needs rebase.

@jorenham jorenham force-pushed the typing/NDArray-shape-default branch from 11606e9 to 0e05de4 Compare September 2, 2024 10:41
@charris charris merged commit bb443be into numpy:main Sep 2, 2024
66 checks passed
@charris
Member

charris commented Sep 2, 2024

Thanks @jorenham .
