-
-
Notifications
You must be signed in to change notification settings - Fork 10.8k
ENH: Add namedtuple return types to linalg functions that return tuples #22786
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
That is, eig(), eigh(), qr(), slogdet(), and svd(). For those functions that return non-tuples with certain keyword arguments, the return type is unchanged. This change should be completely backwards compatible. The namedtuple attribute names come from the array API specification (see, e.g., https://data-apis.org/array-api/latest/extensions/generated/signatures.linalg.eigh.html), with the exception of eig() which is just the same as eigh(). The name of the namedtuple object itself is not part of the specification or the public API. I have not used a namedtuple for the tuple output for qr(mode='raw'), which returns (h, tau). This updates the docstrings to use the updated namedtuple return names, and also the examples to use those names more consistently. This also updates the tests to check each function for the namedtuple attributes at least once.
numpy/linalg/linalg.py
Outdated
@@ -17,6 +17,7 @@ | |||
import functools | |||
import operator | |||
import warnings | |||
from typing import NamedTuple |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Curious about this, it isn't the same as collections.namedtuple. Is typing.NamedTuple the compatible version?
numpy/linalg/linalg.py
Outdated
eigenvectors: NDArray | ||
|
||
class QRResult(NamedTuple): | ||
Q: NDArray |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bit curious about the choice of upper case here and below. Is there a standard for these names?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This NDArray
type name is already present in NumPy. Unless you mean the name of the namedtuple.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Public names of type annotations we have, like np.typing.NDArray
, tend to be upper-case. I think to not confuse them with lower-case objects
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was referring to "Q".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unless you mean the name of the namedtuple.
That was what I was referring to. Looks like NamedTuple is what we want, but am curious if the projects that we want to be compatible with also use it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The uppercase field names are part of the array API spec and were discussed here data-apis/array-api#295. I suppose NumPy could, if it wanted, provide the same fields under multiple names if you really don't like the ones the array API chose, like
class QRResult(NamedTuple):
Q: NDArray
R: NDArray
@property
def q(self):
return self.Q
@property
def r(self):
return self.R
(although personally I think you should just use these names)
The name of the named tuple is completely arbitrary. We can change it to whatever.
As far as I can tell NamedTuple
is basically equivalent to the classic namedtuple
. I think the biggest difference is that NamedTuple supports type annotations, which aren't used in this file yet but presumably they will be at some point. I guess it also makes it easier to add other methods or attributes if you ever wanted to do that.
I don't think it really makes a difference whether you use NamedTuple or namedtuple. I think it could even be something else like a dataclass, so long as it also has the correct field names and also defines enough methods to act like a tuple.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I can tell NamedTuple is basically equivalent to the classic namedtuple
I think NamedTuple is better, for instance, it allows accessing the fields like so: a.Q
, but that wouldn't work for namedtuple
. The declaration of the fields also differs, for instance the declarations here error if namedtuple
is substituted. The upshot is that I think one or the other should be chosen for the standard for portable code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
namedtuple works like that too:
>>> namedtuple('test', ['a', 'b'])(1, 2).a
1
I think the only real difference is that NamedTuple lets you specify type annotations. And the class definition is arguably easier to read. As far as the user is concerned though they are identical.
Could you fix the lint errors? They don't look very difficult. |
I would lean towards also using a named tuple for The changes need to be copied into the typing stubs as well still. Otherwise, I expect the only thing would indeed be small style fixups to make the linter happier. |
@asmeurer Be good to finish this up. |
We should finish this before the 1.25 release. |
What are users expected to do when writing a type annotation for the return value, if |
So maybe the namedtuple classes should be added to the API then. Maybe @BvB93 can give some insight? I'm not really knowledgeable enough about typing stuff to know the right answer here. |
I think there's still issues with
Not to the |
I suppose it would also be reasonable to think about the static types later; at worst users of heavy static typing will have to continue to use the tuple api rather than |
Runtime support for generic (i.e. parameterizable) named tuples was (re?)-introduced last year again (xref python/cpython#92027), so the issue mentioned in stackoverflow here should be irrelevant now. That, and the named tuples introduced in this PR are currently not generic anyway.
Named tuples do use a nominal sub-typing pattern just like normal classes (and contrary to protocols), so would have to expose these named tuples somewhere lest, as pointed out by Eric, the users only option would be to resort to the use of private numpy API.
Either that, or if the returning of named tuples is going to be a common thing then something akin to |
We don't have that many functions that return multiple arrays which can be named sensibly, so I suspect this isn't necessary. |
Branching for 1.25 will be approaching, might be nice to push this over the finish line. |
Co-authored-by: Bas van Beek <43369155+BvB93@users.noreply.github.com>
The circleci failure seems to be because the main version of |
Just linking because I didn't really expect this pattern of someone noticing: HIPS/autograd#597 (comment) They have a dictionary for the result types, which probably only contains |
🤔 I think I'd consider that an issue in their code, changing |
That is, eig(), eigh(), qr(), slogdet(), and svd(). For those functions that return non-tuples with certain keyword arguments, the return type is unchanged. This change should be completely backwards compatible.
The namedtuple attribute names come from the array API specification (see, e.g.,
https://data-apis.org/array-api/latest/extensions/generated/signatures.linalg.eigh.html), with the exception of eig() which is just the same as eigh(). The name of the namedtuple object itself is not part of the specification or the public API.
I have not used a namedtuple for the tuple output for qr(mode='raw'), which returns (h, tau).
This updates the docstrings to use the updated namedtuple return names, and also the examples to use those names more consistently. This also updates the tests to check each function for the namedtuple attributes at least once.