ENH: Make numpy.array_api more portable #25370

asmeurer · 2023-12-12T00:12:18Z

- Make .device return a special CPU_DEVICE object (not present in the namespace) instead of "cpu". See data-apis/array-api#626 (to_device() -- any way to force back to host "portably?").
- Make dtype objects separate classes, rather than reusing NumPy dtype objects. Fixes #23883 (ENH: Make the dtype objects in numpy.array_api more strict).
- Fix nonzero to disallow 0-D inputs (this is not related to portability, but it was a small change so I figured I would just include it here).
- Update the array API compatibility document accordingly.

These will make it harder to write non-portable code that successfully works with numpy.array_api.
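To illustrate why a sentinel device object catches non-portable code, here is a minimal pure-Python sketch. The names `_CPUDevice`, `ToyArray`, `zeros_like_portable`, and `zeros_like_hardcoded` are hypothetical stand-ins invented for this example; this is not the code from the PR.

```python
class _CPUDevice:
    """Hypothetical stand-in for a library-specific CPU device sentinel."""

    def __repr__(self):
        return "CPU_DEVICE"


# Module-private singleton; deliberately not a string such as "cpu".
CPU_DEVICE = _CPUDevice()


class ToyArray:
    """Toy array that only accepts its own device object (illustrative)."""

    def __init__(self, data, device=None):
        if device is not None and device is not CPU_DEVICE:
            raise ValueError(f"Unsupported device {device!r}")
        self.data = list(data)
        self.device = CPU_DEVICE


def zeros_like_portable(x):
    # Portable: forward the device object obtained from an existing array.
    return ToyArray([0] * len(x.data), device=x.device)


def zeros_like_hardcoded(x):
    # Non-portable: assumes every library names its CPU device "cpu".
    return ToyArray([0] * len(x.data), device="cpu")
```

With this toy, `zeros_like_portable(ToyArray([1, 2, 3]))` succeeds, while `zeros_like_hardcoded` raises `ValueError`, which is the kind of early failure a strict testing namespace is meant to produce.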

asmeurer · 2023-12-12T00:14:10Z

numpy/array_api/_dtypes.py

+    def __repr__(self):
+        return f"np.array_api.{self._np_dtype.name}"
+
+    def __eq__(self, other):


I'm thinking it might be a good idea to make == explicitly raise an exception when compared against a numpy dtype object. Right now numpy.float32 == numpy.array_api.float32 returns False, but I feel like this could mask bugs. Is there a straightforward way to detect if other is a NumPy dtype object?

mhvk · 2023-12-14

This is a very fly-by comment, but in Python generally == is supposed never to fail.

asmeurer · 2023-12-14

Yes, I'm not completely sure about it. I did set this to error while I was creating this PR, and it definitely helped me find all the places in the code that needed to be updated.

My worry is that someone might incorrectly do:

    if x.dtype == np.float32:  # x is a numpy.array_api array
        ...

or

    if x.dtype == 'f4':
        ...

The code will just subtly start being wrong.

But then again, I don't know if that kind of error actually exists in the wild. It is already wrong for other libraries like PyTorch.

OTOH, like you say, == failing can be very annoying, as it would make things like `in` not work properly. I think I'm leaning towards not making this change, but I'm interested to hear what others think.
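The trade-off discussed here can be sketched in pure Python. Both classes below are hypothetical stand-ins written for this illustration (they are not the classes from the PR): one returns False for foreign objects, the other raises.

```python
class LenientDType:
    """Hypothetical wrapped dtype: comparing with foreign objects is just False."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        if not isinstance(other, LenientDType):
            return False  # e.g. comparing with "f4" silently takes this path
        return self.name == other.name

    def __hash__(self):
        return hash(self.name)


class RaisingDType(LenientDType):
    """Same wrapper, but comparing with foreign objects raises."""

    def __eq__(self, other):
        if not isinstance(other, RaisingDType):
            raise TypeError(f"cannot compare {self.name} with {other!r}")
        return self.name == other.name

    # Defining __eq__ resets __hash__, so restore it explicitly.
    __hash__ = LenientDType.__hash__


lenient = LenientDType("float32")
print(lenient == "f4")          # False: an `if` on this is silently skipped
print(lenient in ["f4", "f8"])  # False: membership tests still work

strict = RaisingDType("float32")
try:
    strict in ["f4", "f8"]      # membership falls back to ==, which raises
except TypeError as exc:
    print("TypeError:", exc)
```

The lenient variant hides the mistaken comparison, while the raising variant surfaces it at the cost of breaking ordinary container operations.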

seberg · 2023-12-14

Notwithstanding that this should move eventually: for this namespace it seems right to raise errors liberally, since I think by now it is accepted as a testing implementation. If it works with this, it should work with anything.

mhvk · 2023-12-14

How about a warning? For testing purposes, those can be turned into errors.

seberg · 2023-12-15 (edited)

I don't have an opinion, TBH. Besides testing, I never saw much purpose to the namespace as is, and in that context I think liberally raising is OK too.

OTOH, if you formalize the "this is for testing" part of the namespace, then it does seem quite logical to have a custom UndefinedBehavior warning. That would make it easier to ignore it in tests if there is a reason to do so (not sure there is, but maybe it doesn't matter).

(or maybe UnspecifiedBehavior since UB in C is usually a bit scarier.)

rgommers · 2024-01-16

A warning sounds like a nice middle ground to me. But an exception is also defensible, and better than silently passing for unsupported objects to compare against.
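A warning-based `__eq__` along the lines suggested in this thread might look like the sketch below. `UnspecifiedBehaviorWarning`, `WarningDType`, and `compare_strictly` are hypothetical names chosen for this example; the category actually used in the PR may differ.

```python
import warnings


class UnspecifiedBehaviorWarning(UserWarning):
    """Hypothetical category, named after the suggestion in this thread."""


class WarningDType:
    """Hypothetical wrapped dtype that warns on foreign comparisons."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        if not isinstance(other, WarningDType):
            # stacklevel=2 attributes the warning to the caller's comparison.
            warnings.warn(
                f"comparing {self.name} with {other!r} is unspecified",
                UnspecifiedBehaviorWarning,
                stacklevel=2,
            )
            return False
        return self.name == other.name

    def __hash__(self):
        return hash(self.name)


def compare_strictly(dtype, other):
    """Run the comparison with the custom warning escalated to an error."""
    with warnings.catch_warnings():
        warnings.simplefilter("error", UnspecifiedBehaviorWarning)
        return dtype == other
```

In normal use the comparison still returns False and `in` keeps working; a test suite can escalate the dedicated category to an error, or ignore it selectively, without affecting other warnings.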

charris · 2023-12-12T18:53:53Z

This could use a release note.

numpy/array_api/_typing.py

rgommers

Thanks @asmeurer. This looks quite close; I found one issue when testing this on SciPy. Also, the == warning/exception for dtype comparison needs to be finalized.

rgommers · 2024-01-16T12:59:36Z

doc/release/upcoming_changes/25370.compatibility.rst

@@ -0,0 +1,14 @@
+Make ``numpy.array_api`` more portable


Let's keep this release note; it can be cleaned up if and when we split off the whole numpy.array_api module into a standalone package.

rgommers · 2024-01-16T13:04:36Z

numpy/array_api/_dtypes.py

+    def __repr__(self):
+        return f"np.array_api.{self._np_dtype.name}"
+
+    def __eq__(self, other):


A warning sounds like a nice middle ground to me. But an exception is also defensible, and better than silently passing for unsupported objects to compare against.

rgommers · 2024-01-16T13:07:41Z

numpy/array_api/_dtypes.py

+
+    def __hash__(self):
+        # Note: this is not strictly required
+        # (https://github.com/data-apis/array-api/issues/582), but makes the


This does seem like a good idea to add. And we should revisit the topic in the standard at some point.
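For context on why an explicit `__hash__` matters once `__eq__` is customized: in Python, a class that defines `__eq__` without `__hash__` becomes unhashable. The sketch below uses hypothetical classes (`EqOnly`, `EqAndHash`) and a made-up lookup table to show the difference.

```python
class EqOnly:
    """Defines __eq__ only; Python then sets __hash__ to None."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        return isinstance(other, EqOnly) and self.name == other.name


class EqAndHash(EqOnly):
    """Adds a __hash__ that is consistent with __eq__."""

    def __eq__(self, other):
        return isinstance(other, EqAndHash) and self.name == other.name

    def __hash__(self):
        return hash(self.name)


# Hashable dtypes can serve as dictionary keys, e.g. for lookup tables.
itemsize = {EqAndHash("float32"): 4, EqAndHash("float64"): 8}
```

Here `itemsize[EqAndHash("float32")]` works because equal objects hash equally, whereas `hash(EqOnly("float32"))` raises `TypeError`.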

numpy/array_api/_creation_functions.py

rgommers · 2024-01-16T15:33:42Z

Resolving the "2 blank lines" part of the linter complaints would also be nice to take along.

rgommers · 2024-01-16T22:55:35Z

The test_warning_calls failures are real.

And now that gh-25317 is merged, it'd be good to update this PR for device handling of fftfreq/rfftfreq.

rgommers · 2024-01-17T17:17:04Z

This LGTM now, but the added strictness is hard to deal with without the fix suggested in data-apis/array-api-compat#77 (comment). So ideally that would be fixed first before merging this.

These surfaced with the more strict reference implementation in numpy/numpy#25370 and with the addition of a `__array_namespace__` method to `numpy.ndarray` for NumPy 2.0 (which caused the `is_numpy` utility function to be wrong). Even if the `is_numpy` issue is also handled by a change in array-api-compat, this still seems like a logical change to make; if `x.__name__` is `'numpy'` then `is_numpy(x)` should return True. Regarding the tightening up of bool dtype handling in `_lib.lazywhere`: plain `bool` is technically not guaranteed to work, and the stricter checks in the reference implementation will start rejecting it. [skip cirrus] [skip circle]
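The name-based check described in that commit message could be sketched as follows. `is_numpy_namespace` is a hypothetical helper written for this illustration and is not SciPy's actual `is_numpy`; the module objects are stand-ins so that the example runs without NumPy installed.

```python
import types


def is_numpy_namespace(xp):
    """Hypothetical helper: treat a namespace as NumPy based on its name."""
    return getattr(xp, "__name__", None) == "numpy"


# Stand-in module objects; real code would receive actual namespaces.
fake_numpy = types.ModuleType("numpy")
fake_strict = types.ModuleType("numpy.array_api")

print(is_numpy_namespace(fake_numpy))   # True
print(is_numpy_namespace(fake_strict))  # False
```

Under this assumption the strict namespace is not treated as plain NumPy, so NumPy-only shortcuts are not taken for it.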

asmeurer · 2024-01-17T21:23:41Z

@rgommers I'm not following how that array-api-compat issue relates to this.

rgommers · 2024-01-17T21:44:06Z

> @rgommers I'm not following how that array-api-compat issue relates to this.

I ran into that issue when testing this PR with SciPy. All my attempts at fixing up SciPy to be robust against changes in this PR are blocked until that issue is fixed. Basically, the addition of numpy.ndarray.__array_namespace__ changed the behavior of array_api_compat - but that was still okay when this module had some flexibility. The new dtype objects added in this PR are very restrictive, so it's impossible to get away with mixing numpy and numpy.array_api, or using dtype=bool (the Python builtin bool) after this gets merged.

tl;dr I don't really want to leave SciPy (and scikit-learn) in a state where tons of tests fail with NumPy main.
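The effect of restrictive dtype objects on `dtype=bool` can be illustrated with a toy namespace. `ToyDType` and `ToyNamespace` are hypothetical and only model the behavior described above; they are not the numpy.array_api implementation.

```python
class ToyDType:
    """Hypothetical dtype object owned by a toy array namespace."""

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return f"toy.{self.name}"


class ToyNamespace:
    """Toy namespace whose functions accept only its own dtype objects."""

    bool = ToyDType("bool")
    float32 = ToyDType("float32")
    _dtypes = (bool, float32)

    @classmethod
    def zeros(cls, n, dtype):
        # Identity check: only this namespace's dtype objects are accepted.
        if not any(dtype is d for d in cls._dtypes):
            raise TypeError(f"unsupported dtype {dtype!r}")
        return {"data": [0] * n, "dtype": dtype}


xp = ToyNamespace
print(xp.zeros(3, dtype=xp.bool)["dtype"])  # toy.bool
try:
    xp.zeros(3, dtype=bool)  # the Python builtin is rejected
except TypeError as exc:
    print("TypeError:", exc)
```

Code that always spells the dtype through the namespace (`xp.bool`) keeps working, which is the pattern that stays portable across libraries.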

asmeurer · 2024-01-17T21:49:34Z

I guess there must be some complexity in the SciPy implementation for that to be related to this. I'm surprised you have array-API-compatible code that's mixing NumPy dtypes, since that already doesn't work for PyTorch or CuPy.

rgommers · 2024-01-21T21:21:05Z

Now that scipy/scipy#19885 is merged, let's give this a go too. Thanks @asmeurer & reviewers!

rgommers · 2024-01-21T21:23:43Z

@asmeurer it looks like we're ready now for you to create the standalone version of numpy.array_api, so that we can make an announcement and switch over the known consumers and remove numpy.array_api before the 2.0.x branch point.

asmeurer added 4 commits December 11, 2023 12:43

Replace device="cpu" with a special object in numpy.array_api

3b20ad9

This way, it does not appear that "cpu" is a portable device object across different array API compatible libraries. See data-apis/array-api#626.

Use separate wrapped dtype objects in numpy.array_api

13ab654

This way there is no ambiguity about the non-portability of NumPy dtype behavior, or the fact that NumPy dtypes are not necessarily allowed as dtypes for non-NumPy array APIs. Fixes numpy#23883

Disallow 0-dimensional arrays in numpy.array_api.nonzero

28a15d5

This is required by the standard, even though np.nonzero does not yet disallow it.

Update some items in the array_api compatibility document

84a4605
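The 0-D restriction from commit 28a15d5 above can be illustrated with a toy `nonzero` over nested lists, where a bare scalar models a zero-dimensional array. This is a hypothetical sketch, not the numpy.array_api code, and the exception type chosen here is an assumption.

```python
def nonzero(x):
    """Toy nonzero for nested lists; a bare scalar models a 0-D array."""
    if not isinstance(x, list):
        raise ValueError("nonzero is not supported for zero-dimensional input")
    if x and isinstance(x[0], list):
        # 2-D case: return one index list per dimension.
        rows, cols = [], []
        for i, row in enumerate(x):
            for j, value in enumerate(row):
                if value:
                    rows.append(i)
                    cols.append(j)
        return (rows, cols)
    # 1-D case: a single index list wrapped in a tuple.
    return ([i for i, value in enumerate(x) if value],)


print(nonzero([0, 3, 0, 5]))      # ([1, 3],)
print(nonzero([[0, 1], [2, 0]]))  # ([0, 1], [1, 0])
```

Calling `nonzero(5)` raises `ValueError`, mirroring the idea that a 0-D input has no dimension along which to report indices.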

asmeurer commented Dec 12, 2023

jakevdp mentioned this pull request Dec 12, 2023

ENH: Make the dtype objects in numpy.array_api more strict #23883

Closed

rgommers added the component: numpy.array_api label Dec 12, 2023

charris changed the title ~~Make numpy.array_api more portable~~ ENH: Make numpy.array_api more portable Dec 12, 2023

Add a release notes entry

64c7478

BvB93 reviewed Dec 13, 2023

numpy/array_api/_typing.py (outdated, resolved)

Fix some type hint aliases for numpy.array_api

700ded8

rgommers mentioned this pull request Jan 16, 2024

ENH: Add fft optional extension submodule to numpy.array_api #25317

Merged

rgommers reviewed Jan 16, 2024

rgommers mentioned this pull request Jan 16, 2024

MAINT: fix some small array API support issues scipy/scipy#19885

Merged

asmeurer added 3 commits January 16, 2024 14:23

Fix bug in numpy.array_api.asarray with copy=True

4f2b67a

Make numpy.array_api dtypes issue a warning when compared against numpy dtypes

595342c

This is to prevent user error, since something like numpy.array_api.float32 == numpy.float32 gives False.

Fix some linter issues

8e2af02

asmeurer added 3 commits January 16, 2024 16:18

Add stacklevel to a warning

0915030

Merge branch 'main' into array_api-portability

61f97f0

Fix the array_api fft creation functions to use the custom CPU_DEVICE object

7f354e5

rgommers merged commit ad36032 into numpy:main Jan 21, 2024

rgommers added this to the 2.0.0 release milestone Jan 21, 2024

lucascolley mentioned this pull request Jan 21, 2024

MAINT: special: array types: fix warning when not in array API mode scipy/scipy#19938

Merged


asmeurer commented Dec 12, 2023

asmeurer Dec 12, 2023

mhvk Dec 14, 2023

asmeurer Dec 14, 2023

seberg Dec 14, 2023

mhvk Dec 14, 2023

seberg Dec 15, 2023 (edited)

rgommers Jan 16, 2024
