-
-
Notifications
You must be signed in to change notification settings - Fork 10.8k
TYP: Make array _ShapeType bound and covariant #26081
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@BvB93 Ping. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good change I would say, though one thing I feel is still really missing is a convenient public-facing type alias for shaped arrays (see comment below).
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
Thanks @Jacob-Stevens-Haas for implementing my proposal at #16544 (comment) 😄 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we could have a few reveal tests for the new Array
type, in one of the numpy/typing/tests/data/reveal/
files (or a new test file for that), something similar to:
if sys.version_info >= (3, 11):
Length = NewType("Length", int)
Width = NewType("Width", int)
arr: npt.Array[Length, Width, np.int8]
assert_type(arr, npt.Array[int, int, np.int8])
Ok, so I added some tests for from typing import TypeVar
T = TypeVar("T", bound=str)
S = TypeVar("S")
def foo(a: T) -> T: ...
def bar(b: S, c: S) -> T:
return foo(b) # type error since object is not a str Unfortunately, pyright also accepts things like As a result, I've reverted the tests for |
@Jacob-Stevens-Haas I agree that it would be best if there'd be upper bounds on the variadic shape type params. |
This comment was marked as outdated.
This comment was marked as outdated.
@Jacob-Stevens-Haas Yes, option B seems to be the most obvious to me. In several of my projects I've been using effectively the same variadic alias, with sub-aliases for specific common type Vector[N: int, T: np.generic] = Array[N, T]
type Matrix[M: int, N: int, T: np.generic] = Array[M, N, T] I'm not saying you should add these here, but it's just an example of how you can add upper bound restrictions on shape type parameters, even if they are missing from |
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the delays, got a few mostly documentation-related comments though.
Does putting the relevant code behind a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the other build failure is unrelated to this PR. I re-triggered the build in case it was a transient issue.
There's another issue in the docs build that will need to be fixed, see below. You should be able to run the docs build locally after installing sphinx by doing "spin docs" in the root of the numpy repo.
Ah the mac CI failure is due to an issue with spin: scientific-python/spin#193. Unrelated to this PR, please ignore it. |
This comment was marked as outdated.
This comment was marked as outdated.
Yes those failures are unrelated and if you rebase or merge they'll go away. I haven't been involved with reviewing this and am not familiar with typing so I'll wait for someone else to take a final look. |
Eventually it did. There was one time I used * instead of |
There were three different issues that happened to break CI over the past few days and you hit two of them. Should all be good now. |
Thanks Joren - good catches all around. My only discussion point is that |
Alright, I think I've got all of the comments now. If it passes tests, it should be good to re-review. Do you prefer if reviewer or reviewee resolves old review comments? I tend to leave them open with a 👍 once I think we're on the right page in case you want to check them in re-review, but I know people differ on this. |
I think it'll make reviewing easier if I mark them as resolved myself |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are only a couple of minor suggestions left now.
Once those are out of the way, I think that we can go ahead and merge this :)
edit:
I noticed that the commit history has gotten quite messy, and should probably be cleaned up a bit, as described in:
https://numpy.org/doc/stable/dev/development_workflow.html#rewriting-commit-history
Would you be ok with me just squashing all commits together? It's not like there's several pieces put together here - just one change, copied across a few subclasses, the tests for that change, and the release note for that change. |
Yea that's fine by me |
Fixes #25729 This change allows future changes to the static typing of numpy that modify or only work with certain numbers of dimensions. It also applies the change to subclasses of ndarray and adds tests. It allows users to statically type their array shapes with subtypes of tuple (e.g. NamedTuple) and tuples of int subtypes (e.g. Literal or NewType). For a discussion of the merits of TypeVarTuple vs a tuple-bound TypeVar, see the linked PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left two tiny suggestions, and a bonus one.
Once these two are fixed, we can merge this AFAIK
it was decided to create a separate PR for the shaped type alias
Thanks a lot @Jacob-Stevens-Haas! Everyone can now go ahead and start typing the shapes of arrays 🎉. I'm looking forward to your PR for the shaped |
Thanks again for your help and enthusiasm @jorenham! I'm curious whether you think it's worth working on the PEP for |
Please do! Perhaps you could also allow
Well, besides the covariance and bounds issues, there are other reasons why I think that a plain tuple So this is not a problem: type ArrayTup[ShapeT: tuple[int, ...], SCT: generic] = ndarray[ShapeT, dtype[SCT]]
type ArrayVar[*Shape: int, SCT: generic] = ArrayTup[tuple[*Shape], SCT] But the other way around is impossible: type ArrayVar[*Shape: int, SCT: generic] = ndarray[tuple[*Shape], dtype[SCT]]
type ArrayTup[ShapeT: tuple[int, ...], SCT: generic] = ArrayVar[???, SCT]
Don't underestimate the amount of bureaucracy and discussion that's involved in attempting to change things like this: I once made an attempt at a similar PEP that allowed specifying co- and contra- variant type params as I'm not trying to demotivate you or anything, but just know what you're getting yourself into. |
Great to see this land (and just in time for 2.1.0 too), thanks a lot @Jacob-Stevens-Haas, @jorenham & all reviewers! |
Fixes #25729. Applies to main ndarray class as well as record arrays, masked arrays, masked record arrays, mmaps, and matrices.
I didn't see anything in the dev guide about testing static types for PRs, but this change fixes the MWE in the linked issue. While there hasn't been much discussion on that issue, #16544 has renewed interest, so submitting this in case it's helpful.