ENH: Make the dtype objects in numpy.array_api more strict #23883
Comments
Hello, I would like to work on this, but first I would like to understand the problem in more depth. Currently we pass dtype as a parameter to the NumPy functions in numpy.array_api, so what would an updated function look like after changes such as implementing the Dtype class in the dtypes.py file, as mentioned above? Can you show an example of 'dtype._np_dtype' being used in a NumPy function? |
This way there is no ambiguity about the non-portability of NumPy dtype behavior, or about the fact that NumPy dtypes are not necessarily allowed as dtypes for non-NumPy array APIs. Fixes numpy#23883
Commenting here after seeing #25370 sent out: does this mean libraries that want to be compatible with the array API should focus on having a separate API-compatible namespace, rather than making the primary namespace more compatible with the API? For example, in the long term, should we aim to make |
tl;dr: it's up to you, but I would recommend aiming to make your main namespace compatible. Some background here: our initial plan for So instead, our way of thinking morphed a bit. The For end-users, they are going to just be using normal NumPy arrays (as usual), or PyTorch Tensors, or JAX arrays, or whatever. So they want to be able to pass these to array API compatible functions. Since the main NumPy namespace isn't array API compatible, we created the compat library to wrap it in an array API compatible way. Unlike The most recent news here is that NumPy has decided to make its main namespace fully array API compatible for NumPy 2.0. This will involve a few breaking changes. This work is tracked by #25076. You can also see what sorts of changes are necessary at https://numpy.org/devdocs/reference/array_api.html. In principle, once this work is completed, the compat library will no longer be needed for NumPy arrays, although it will still be needed for PyTorch and other libraries, and it also has some useful helper functions like Additionally my understanding is that I would say that the main lesson from all this is that trying to make a separate namespace is a mistake. This is especially true if it would mean a separate array object. But even if it doesn't, you might as well make your main namespace as compatible as possible, and put any remaining wrappers that can't live there (e.g., because of backwards compatibility concerns) in the array-api-compat library. |
Thanks for that context – that's very helpful. |
Proposed new feature or change:
In NEP 47, we decided to reuse normal numpy dtype objects in numpy.array_api. However, as we've come to use the array API standard more and learn about how it is used, we've learned that numpy.array_api is primarily useful as a strict namespace for libraries to test against to ensure they don't deviate from the standard. We've also learned that it's very difficult to mix numpy.array_api with normal numpy, and usage of actual numpy ndarrays should take a different approach.
Therefore, I think it makes more sense to use separate dtype objects in numpy.array_api, similar to how we use a distinct Array object. This would remove all dtype attributes and behaviors from the NumPy dtype objects that aren't guaranteed by the standard. That is, nothing should be implemented on the objects except for __eq__, and __eq__ itself should only compare directly against the objects themselves, not against things like dtype strings.
Note that this isn't particularly high priority, as currently any library that uses NumPy dtype specific behaviors will also detect this as soon as they test against PyTorch, whose dtype objects don't share anything in common with NumPy. But it is still a good idea in my opinion and fits with the overall goal of the module. With that being said, I don't plan to implement this for the 1.25 release.
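For anyone unsure what "NumPy dtype specific behaviors" means in practice, here is a small illustration using plain NumPy. It only shows behaviors that ordinary NumPy dtype objects have today; the variable name dt is just for the example.

```python
import numpy as np

dt = np.dtype("float64")

# NumPy dtypes compare equal to strings and Python types that name them.
# Per the issue text, the standard only guarantees dtype-to-dtype __eq__.
print(dt == "float64")  # True
print(dt == float)      # True

# They also carry extra attributes that a strict dtype object would not expose.
print(dt.itemsize)      # 8
print(dt.kind)          # f
```

Code that relies on any of these would keep working against numpy.array_api as long as it reuses NumPy's dtype objects, which is the leak the proposal wants to close.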
If someone else wants to implement this, it should be straightforward:
Replace the current dtype objects in numpy.array_api with something like
Everywhere that a dtype object is passed to an actual NumPy function in the numpy.array_api code, replace dtype with dtype._np_dtype.