Skip to content

ENH: Add the linalg extension to the array_api submodule #19980

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 30 commits into from
Nov 14, 2021

Conversation

asmeurer
Copy link
Member

@asmeurer asmeurer commented Sep 27, 2021

This is an in progress PR for adding the linalg extension to the array API submodule.

@asmeurer
Copy link
Member Author

Ugh, I didn't know that @charris was going to do a squash merge of my other PR. Does anyone know how to sanely rebase this so that it keeps my existing merge conflict resolutions from aeced8f (#19980)?

Copy link
Member

@seberg seberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably, should just keep things moving here (as soon as the rebase is cleared up). But looking through, I had some questions/comments, partly about the API standard, but maybe you can forward them there if necessary, since I could not find a discussion for a few of them.

For the rebase, maybe it is easier to just squash all of your commits and rebase after that? That way the conflicts are all in one go and don't keep recurring at least and there is probably little reason to preserve the history here anyway.

In my opinion, you could probably remove most (probably not all) of the Note: comments (or shorten them). It is clear that the code means there is a difference to NumPy proper, in some cases it is unclear/interesting what the difference is though.

@@ -992,6 +1013,12 @@ def dtype(self) -> Dtype:
def device(self) -> Device:
return "cpu"

# Note: mT is new in array API spec (see matrix_transpose)
@property
def mT(self) -> Array:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel this was discussed in NumPy before (including the potential .T limitation!), time to revive that discussion!? I find it mildly annoying that we do these things in the new namespace, but seem avers to discussing it for NumPy proper. If we can get away with doing these things in NumPy it would be the best way to move code towards the better standard!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new APIs for matrix transpose were chosen so that they can be as backwards compatible as possible with libraries like NumPy that already use .T for reversing the axes. See data-apis/array-api#228 and the corresponding PRs.

Copy link
Member

@rgommers rgommers Sep 30, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel this was discussed in NumPy before (including the potential .T limitation!), time to revive that discussion!?

Yes I think so.

I find it mildly annoying that we do these things in the new namespace, but seem avers to discussing it for NumPy proper.

I don't think we are, it's just that no one followed up/through on making matrix_transpose public, and the discussion was framed around a big backcompat break in .T, which did not help. I think no one would object to making the private matrix_transpose public (EDIT: and then add .mT), it just needs doing?

If we can get away with doing these things in NumPy it would be the best way to move code towards the better standard!

agreed

Comment on lines 397 to 398
if self.dtype not in _boolean_dtypes:
raise ValueError("bool is only allowed on boolean arrays")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason why invalid dtypes sometimes raise a ValueError while in others they raise a TypeError?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likely I changed it at some point in my previous PR but haven't yet backported the change to this one (I originally started this as part of #19739 but moved it to a separate branch). Thanks for the heads up.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked this, there are indeed still 2 places where ValueError is used: in __float__ and in __bool__. Let's fix that.

Copy link
Member

@rgommers rgommers Nov 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, @asmeurer you seem to have done this on purpose:

def test_python_scalar_construtors():
    b = asarray(False)
    i = asarray(0)
    f = asarray(0.0)

    assert bool(b) == False
    assert int(i) == 0
    assert float(f) == 0.0
    assert operator.index(i) == 0

    # bool/int/float should only be allowed on 0-D arrays.
    assert_raises(TypeError, lambda: bool(asarray([False])))
    assert_raises(TypeError, lambda: int(asarray([0])))
    assert_raises(TypeError, lambda: float(asarray([0.0])))
    assert_raises(TypeError, lambda: operator.index(asarray([0])))

    # bool/int/float should only be allowed on arrays of the corresponding
    # dtype
    assert_raises(ValueError, lambda: bool(i))
    assert_raises(ValueError, lambda: bool(f))

    assert_raises(ValueError, lambda: int(b))
    assert_raises(ValueError, lambda: int(f))

    assert_raises(ValueError, lambda: float(b))
    assert_raises(ValueError, lambda: float(i))

    assert_raises(TypeError, lambda: operator.index(b))
    assert_raises(TypeError, lambda: operator.index(f))

Can you comment?

EDIT: diff I tried to apply but left out in the end:

$ git diff
diff --git a/numpy/array_api/_array_object.py b/numpy/array_api/_array_object.py
index 8794c5ea57..74e659e81e 100644
--- a/numpy/array_api/_array_object.py
+++ b/numpy/array_api/_array_object.py
@@ -416,7 +416,7 @@ def __bool__(self: Array, /) -> bool:
         if self._array.ndim != 0:
             raise TypeError("bool is only allowed on arrays with 0 dimensions")
         if self.dtype not in _boolean_dtypes:
-            raise ValueError("bool is only allowed on boolean arrays")
+            raise TypeError("bool is only allowed on boolean arrays")
         res = self._array.__bool__()
         return res
 
@@ -454,7 +454,7 @@ def __float__(self: Array, /) -> float:
         if self._array.ndim != 0:
             raise TypeError("float is only allowed on arrays with 0 dimensions")
         if self.dtype not in _floating_dtypes:
-            raise ValueError("float is only allowed on floating-point arrays")
+            raise TypeError("float is only allowed on floating-point arrays")
         res = self._array.__float__()
         return res
 
@@ -515,7 +515,7 @@ def __int__(self: Array, /) -> int:
         if self._array.ndim != 0:
             raise TypeError("int is only allowed on arrays with 0 dimensions")
         if self.dtype not in _integer_dtypes:
-            raise ValueError("int is only allowed on integer arrays")
+            raise TypeError("int is only allowed on integer arrays")
         res = self._array.__int__()
         return res

@asmeurer
Copy link
Member Author

@seberg most of your comments are actually about the code from the already-merged #19937, and they will disappear once I rebase this branch properly.

These functions won't be added until next year so there is no reason to have
commented out versions of them sitting around.
The functions are now all in numpy/array_api/linalg.py, and the public ones
(in the top-level namespace as opposed to the numpy.array_api.linalg extension
subnamespace) are differentiated by which names are included in
numpy.array_api.__all__.
norm() has been split into the matrix_norm() and vector_norm() functions.
These functions are not yet updated to function properly.
@asmeurer
Copy link
Member Author

asmeurer commented Oct 1, 2021

I fixed the git history (for anyone wondering, I rewound the branch pre-merge, redid the merge against main, then cherry-picked the post-merge commits).

I would appreciate more reviews on this, but I don't think it should be merged yet. The reason is that it is seriously undertested. The array API test suite has almost no tests for the linalg extension, except for very basic checks. So I would like to do some work there before I feel confident about this, especially since there are a number of differences from the base numpy.linalg functions.

The NumPy versions operate on the first two axes, but the array API diagonal
should operate on the last two, so that they work correctly on stacks of
matrices. See data-apis/array-api#241.
@asmeurer
Copy link
Member Author

I have addressed all the review commands and updated all the latest changes from the spec. This is now ready for review.

@rgommers rgommers added this to the 1.22.0 release milestone Nov 13, 2021
Copy link
Member

@rgommers rgommers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went through the code again, and also reviewed all unresolved review comments in detail (fixing those up if needed). It all LGTM now, and CI is green except some "line too long" complaints which are best ignored.

So in it goes. Thanks @asmeurer & all reviewers!

@rgommers rgommers merged commit a181350 into numpy:main Nov 14, 2021
leofang pushed a commit to leofang/cupy that referenced this pull request Nov 19, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants