Discussion: Move numpy.array_api elsewhere? #22252

Closed
leofang opened this issue Sep 12, 2022 · 5 comments

@leofang
Contributor

leofang commented Sep 12, 2022

As discussed in recent Python array API standard meetings, @seberg and others raised the question of the future of numpy.array_api. One of the reasons is that being under a namespace does not make it very useful. It was suggested that it could become a standalone package, and the CuPy team is interested in making such a package support cupy.ndarray too, to avoid code duplication. I am opening this issue to gather additional feedback and discuss technical issues.

cc: @kmaehashi @rgommers

@seberg
Member

seberg commented Sep 13, 2022

To clarify for others what (in my opinion) the state is:

  1. The current numpy.array_api seems mainly useful and needed for testing. However, if the main use is testing, it does not need to live in NumPy (at least not NumPy proper).
    • This namespace should probably be moved elsewhere.
  2. Libraries (who are the main audience) probably want a pragmatic implementation for use with NumPy arrays. That could be np.array_api, or it could be only available through np.ndarray.__array_namespace__().
    • This namespace needs to be created or consolidated, as otherwise libraries may each end up creating their own mock version. Fortunately, it should be fairly straightforward to do so.

With respect to CuPy, the second namespace is interesting. Whatever we do for NumPy (or CuPy) should translate almost 1:1 to the other, but I suspect a change like replacing numpy with cupy can probably not be avoided?
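
As a rough illustration of the second option, a library could look up the namespace through `__array_namespace__()` rather than importing a specific module. This is only a minimal sketch: `get_namespace`, `ToyArray` and `toy_xp` are hypothetical names made up here, and the toy array type stands in for a real array class so the snippet runs without any particular NumPy or CuPy version installed.

```python
from types import SimpleNamespace


def get_namespace(*arrays):
    """Return the one array-API namespace shared by all inputs (hypothetical helper)."""
    if not arrays:
        raise ValueError("at least one array is required")
    xp = None
    for a in arrays:
        if not hasattr(a, "__array_namespace__"):
            raise TypeError(
                f"{type(a).__name__} does not implement __array_namespace__"
            )
        ns = a.__array_namespace__()
        if xp is None:
            xp = ns
        elif ns is not xp:
            # Mixing arrays from different libraries is rejected outright.
            raise ValueError("inputs come from more than one array library")
    return xp


# Hypothetical stand-in for an array library's namespace object.
toy_xp = SimpleNamespace(name="toy_xp")


class ToyArray:
    """Hypothetical array type; real libraries return their own namespace here."""

    def __array_namespace__(self, api_version=None):
        return toy_xp


xp = get_namespace(ToyArray(), ToyArray())
print(xp.name)
```

With this pattern the calling library never names numpy or cupy itself, which is why the same code could in principle serve both.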

@rgommers
Member

I think we can indeed discuss moving the current numpy.array_api to a standalone package, especially if it helps avoid duplication. Flip sides of that coin:

  • The code will have to have explicit differences in places to deal with numpy vs cupy
  • It will have to declare a dependency on a specific version of numpy (and cupy), or at least it won't work with older numpy versions, because a number of fixes went into numpy to support standard-compliant behaviour
  • There's now extra overhead in testing, releasing, etc.

@leofang where do you think it should live? And do you think the overhead will go down in total, given that there's now a separate package to maintain? Also, if that exists standalone, will you still vendor it into CuPy or declare a dependency on it?

  1. Libraries (who are the main audience) probably want a pragmatic implementation for use with NumPy arrays.

This is related to gh-21135 and to data-apis/array-api#400. There's a lot more detail there, so I suggest keeping this issue about the current, strictly compliant-only version of numpy.array_api and discussing this point on those other issues.

@seberg
Member

seberg commented Sep 19, 2022

I think we can indeed discuss moving the current numpy.array_api to a standalone package, especially if it helps avoid duplication. Flip sides of that coin:

  • The code will have to have explicit differences in places to deal with numpy vs cupy

I do not think that is necessary. CuPy should not need such an implementation since the purpose of this package will be to test that a library does not require API beyond the minimal array-API implementation (should generalize to all implementors).
To test if they work with CuPy, they should test explicitly against the new CuPy array-API implementation.
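
To make the testing purpose concrete, here is a minimal sketch of how a deliberately restricted namespace surfaces accidental use of non-standard API. It is not how numpy.array_api is implemented: `StrictNamespace`, `norm2` and the allow-list are hypothetical, and the standard library `math` module stands in for an array library so the snippet runs anywhere.

```python
import math


class StrictNamespace:
    """Expose only an allow-listed subset of a backend module (hypothetical)."""

    def __init__(self, backend, allowed):
        self._backend = backend
        self._allowed = frozenset(allowed)

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, i.e. for backend names.
        if name.startswith("_") or name not in self._allowed:
            raise AttributeError(f"{name!r} is not part of the minimal API")
        return getattr(self._backend, name)


def norm2(xp, a, b):
    """Library code under test: it should only use allow-listed functions."""
    return xp.sqrt(a * a + b * b)


# Assumption for illustration: pretend only sqrt and floor are in the minimal API.
xp = StrictNamespace(math, allowed={"sqrt", "floor"})

print(norm2(xp, 3.0, 4.0))  # works: sqrt is allowed

try:
    xp.hypot(3.0, 4.0)  # math has hypot, but it is outside the allow-list
except AttributeError as exc:
    print("caught:", exc)
```

A library whose test suite passes against such a restricted namespace has some evidence that it does not lean on extras of one particular backend.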

  • It will have to declare a dependency on a specific version of numpy (and cupy), or at least it won't work with older numpy versions, because a number of fixes went into numpy to support standard-compliant behaviour
  • There's now extra overhead in testing, releasing, etc.

Right, these are indeed annoying. OTOH, you will not be locked into the NumPy release cycle. Since this package is most interesting for testing/CI, maybe that is the bigger advantage? I.e. it is OK if the tests are not always run. Tests might use switches like TEST_ARRAY_APIS=available or TEST_ARRAY_APIS="minimal,numpy.ndarray,cupy.ndarray", so that CI will enforce running them, but users won't have to worry about having the right NumPy version.
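
One way such a switch could be read in a test suite is sketched below. Everything beyond the `TEST_ARRAY_APIS` variable and the three backend names is an assumption: the `selected_backends` helper, the `importable` argument, treating an unset variable as `available`, and the "strict" flag are hypothetical choices for illustration.

```python
import os

# Backend names taken from the example values in the comment above.
KNOWN_BACKENDS = ("minimal", "numpy.ndarray", "cupy.ndarray")


def selected_backends(environ=None, importable=()):
    """Return (backend_names, strict) for the current test run.

    strict=True means a missing backend should fail the run (CI mode);
    strict=False means only backends that happen to be importable are used.
    """
    environ = os.environ if environ is None else environ
    # Assumption: an unset variable behaves like "available".
    raw = environ.get("TEST_ARRAY_APIS", "available").strip()
    if raw == "available":
        return [name for name in KNOWN_BACKENDS if name in importable], False

    requested = [part.strip() for part in raw.split(",") if part.strip()]
    unknown = [name for name in requested if name not in KNOWN_BACKENDS]
    if unknown:
        raise ValueError(f"unknown array API backends: {unknown}")
    return requested, True


# Local run: only test what is installed (here we pretend only NumPy is).
print(selected_backends({}, importable=("numpy.ndarray",)))

# CI run: the listed backends are mandatory.
print(selected_backends({"TEST_ARRAY_APIS": "minimal,numpy.ndarray,cupy.ndarray"}))
```

In a real suite the `importable` tuple would be filled by attempting the imports, and the strict flag would decide between skipping and failing a test when a backend is absent.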

@rgommers
Member

To test if they work with CuPy, they should test explicitly against the new CuPy array-API implementation.

I think that the reason @leofang brought this up is to keep the CuPy and NumPy version of the current, largely duplicated, code in sync. If the answer is now: "let numpy.array_api and cupy.array_api grow beyond strict compliance to be more useful", that doesn't really answer that concern. The code remains in NumPy and in CuPy and remains duplicated (you just called it new here, but it's not really new - it's a superset of what we have right now, most of it remains the same).

@lucascolley
Contributor

it has moved!

@charris charris closed this as completed May 18, 2024