ENH: Overhaul of NumPy main namespace [NEP 52] #24306
Comments
We decided on the name For Other more minor issues I saw reading the list over:
|
I wrapped the long table in the issue description in a details block to make this discussion thread a little easier to read. |
Just went through the matplotlib codebase looking for these. You can see a fuller analysis in matplotlib/matplotlib#26422. From a matplotlib perspective, I don't think we are prepared to take on a scipy dependency (I think it may even cause a deadlock in some instances, if I remember correctly). Thus I'd argue that I'm less worried about the window functions, as those are mostly used in tests/examples anyway, and those functions are simple enough to just include inline if needed (and when they are used in library code, it is mostly for functions that I'd argue we should deprecate ourselves). I was also surprised to see I would also argue in favor of keeping Those are the big ones from my POV. I wouldn't exactly mind not having to update some of the others on that list, but updating them is not that bad either. |
@ksunden Thank you for the feedback! Looking briefly once again I spotted a mistake in my list: for |
I think the list above is considerably more aggressive than NEP 52. There are things I believe we can even just delete, like practically all of those strange upper-case enums (very few to no one should be using those). But things like |
@ngoldbaum I renamed entries to
@ksunden I moved @seberg I removed those upper-case enums from the main namespace. |
The errors and warnings should probably also get deprecation warnings if they're explicitly imported from the main namespace and removed from |
Do you want to remove one of the |
@timhoffm I think it would be more consistent to drop aliases and have only one function with a specific functionality. But I would assume that these core functions are heavily used, and a gain of reducing main namespace by two entries might not be worth breaking API here. But I don't have a strong opinion on that. |
@mtsokol one can definitely argue both ways. I just wanted to bring this to the table. I also don't have the usage insight and knowledge of numpy policy priorities to decide what's reasonable here. - If you keep the aliases, I suggest still blessing one and discouraging the other, so that at least new code will grow in a consistent direction. |
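One way to "bless one and discourage the other" is to keep the alias callable but have it emit a `DeprecationWarning` that names the canonical function. The following is a minimal sketch under stated assumptions: `deprecated_alias`, `total`, and `sumall` are hypothetical names invented for illustration, not NumPy functions or NumPy's actual deprecation machinery.

```python
import functools
import warnings


def deprecated_alias(canonical, old_name):
    """Return a wrapper that forwards to `canonical` and warns about `old_name`."""

    @functools.wraps(canonical)
    def wrapper(*args, **kwargs):
        warnings.warn(
            f"`{old_name}` is a deprecated alias; use `{canonical.__name__}` instead.",
            DeprecationWarning,
            stacklevel=2,  # attribute the warning to the caller, not to this wrapper
        )
        return canonical(*args, **kwargs)

    wrapper.__name__ = old_name
    return wrapper


def total(values):
    """Hypothetical canonical function (the 'blessed' name)."""
    return sum(values)


# Hypothetical discouraged alias: still works, but steers new code to `total`.
sumall = deprecated_alias(total, "sumall")
```

Calling `sumall([1, 2, 3])` returns the same result as `total([1, 2, 3])` while warning, so existing code keeps running and new code is nudged toward a single name.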
Thanks @mtsokol for getting the ball rolling here! I'm first commenting on all the feedback, then I'll have to go back and add my own comments on the individual per-function proposals.
I agree with this. Most of the feedback here (in the whole thread so far) is quite useful and on-point. I think it can be incorporated and a next version of the table/plan posted for discussion.
+1 to them not moving. They should not be in two places, only stay where they are now. And yes, definitely let's deprecate and then remove all the aliases.
This seems desirable, but a project in and of itself. data-apis/array-api#621 is relevant here.
I think
Agreed. And we know what the canonical names are |
Hi @rgommers, [UPDATE 31.08] The final list is present in the PR's description. |
A few more comments on the list below. I think if we can quickly clean up a lot of the obvious ones (stray variables, aliases, enums, etc.), then the list will get a lot shorter and easier to review. Another category is the ones that should definitely stay (e.g. everything in the array API standard), as well as numpy-specific heavily used functionality without a clear replacement (e.g.
Note that these are the canonical preferred names. These are the ones we want to keep, and in addition the canonical C names. Every other alias should go. The exceptions here are the
can you undo those? I don't think this will make it into 2.0, and we are likely to keep these aliases around for a long time even if the keyword idea arrives in time.
It would be useful to double check your list with the array API standard. Anything that's in the main namespace there, like
These are about angles, and go together with |
|
@rgommers Sure, I will prepare a shorter list for further discussion.
Sure! I will revert it. To explain, for
Sure!
Sure! I will check it.
Will move these back. |
The bitsized integer types being aliases of the C integer types and not the other way around is a known issue. Ultimately it comes down to these typedefs. This also means that, depending on the platform, the C integer types may or may not be aliased to a bitsize. @seberg recently attempted to make the C type names aliases to the bitsized types but it's complicated. For now I would focus on the other aliases and come back to the integer aliases later. |
I usually think of this purely from an end user focused API perspective. It doesn't matter one way or the other what the name is of the actual implementation under the hood. It's more what the docs say or what common practice is (e.g. code uses |
@rgommers here I share an updated list divided into three sections ( [EDIT 22.08.2023] updated list to the latest version. |
That looks quite nice and easier to review, thanks Mateusz! The "remove" list looks pretty good; The "keep" list looks pretty good too. The few objects that jump out are the ones with trailing underscores - in particular I think that For the "tentative list", some more comments:
|
@rgommers Sure! I applied all points to my list and updated the comment. I've got one thing to confirm about |
Yeah, best not to touch it now - it's a little complicated. I think the plan is to reintroduce |
Might be a bit early to reintroduce it, but I am fine to do so. Also remember, at least it probably had a DeprecationWarning before that. Would lean to just not do anything about
|
I think it's pretty harmless, at least it's hard to imagine what would go wrong (typical usage was
They're trivial to recover in the rare cases where you need them, right? Like so:

>>> np.bool_(False) is np.False_
True

If that is correct, they really have no business being in the top-level namespace. |
Sure you can, let's see what others think. You will have to change the repr to |
I think that's not the repr?

>>> np.False_.__repr__()
'False'
>>> print(np.False_)
False
I think it being a singleton should be an implementation detail that no one should rely on. Nor does it really matter. From an API perspective this looks to me like a weird object that is trivially reconstructed.
Agreed. Maybe there is a concrete use case someone can share? |
@mtsokol we chatted at the triage meeting about the tentative list and we ended up deciding that most of the tentative column that isn't already deprecated should be deprecated, except
Other less-used array utility functions can move to I think that covers everything, let me know if there are any other corner cases! |
We also discussed perhaps creating a package to put on PyPI that would restore many of the functions removed in NumPy 2.0. By pip-installing this package, users could continue to use their favorite aliases and functions that were removed, perhaps with a DeprecationWarning when importing or using some of them. Does that sound like a reasonable compromise, or would it entail too much ongoing maintenance burden? |
That sounds like it has the potential to be a maintenance burden that could suffer from bitrot. |
It is good to keep in mind that the needed downstream changes should be minimal if we want to avoid a Python 3 situation. Folks can, and do, ignore deprecation warnings, since the warnings don't break code. |
If any of the functions/aliases that got removed in Part 1 or Part 2 should still be available and deprecated, I can restore them. If there should be a longer deprecation period, I think it's straightforward to provide it. So far, the items removed from the main namespace are internal enums, already deprecated functions and redundant aliases. Here's a complete list so far: https://github.com/numpy/numpy/blob/main/doc/source/release/2.0.0-notes.rst#numpy-20-python-api-removals If all removed aliases and functions from the main namespace should be available even after the NumPy 2.0 release, maybe it's better to keep them in a separate "1.x legacy" module and inject them into the main namespace after enabling a flag: |
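The snippet that accompanied the "1.x legacy" idea is not preserved in this thread. Purely as an illustration of the idea, a flag-gated injection could look roughly like the sketch below. Everything in it is an assumption: `legacy_1x`, `main_ns`, `build_namespace`, and the `enable_legacy` flag are hypothetical names, and nothing here is taken from NumPy.

```python
import types

# Hypothetical "1.x legacy" module holding names removed from the main namespace.
legacy_1x = types.ModuleType("legacy_1x")
legacy_1x.alltrue = all  # placeholder implementation, for illustration only


def build_namespace(enable_legacy: bool) -> types.ModuleType:
    """Build a hypothetical main namespace, optionally injecting legacy names."""
    main_ns = types.ModuleType("main_ns")
    main_ns.all = all  # the canonical name is always present

    if enable_legacy:
        # Copy every public name from the legacy module into the main namespace.
        for name, obj in vars(legacy_1x).items():
            if not name.startswith("_"):
                setattr(main_ns, name, obj)
    return main_ns
```

With `build_namespace(False)` the legacy name is absent, and with `build_namespace(True)` it is available again. In a real package the flag would more plausibly be an environment variable or configuration option read at import time, which is again an assumption.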
I think we want to do this in such a way that guides users to the correct way to fix their code. One alternative to deprecating things is to break them, but make sure the error users see gives the migration path. For example, if we remove aliases, we can do so in such a way that if a user imports the name or accesses it as an attribute of the numpy module they get an ImportError or AttributeError that gives them the migration path:
This allows us to do these renames and clean up namespaces, not leave the old names behind with an ignorable deprecation that only delays user pain, and break user code in such a way they hopefully immediately see what broke and how to fix their code. |
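The "break it, but with a helpful error" approach can be sketched with a module-level `__getattr__` (PEP 562), which Python calls only when normal attribute lookup on a module fails. This is a minimal sketch under stated assumptions: the `demo` module, the `_REMOVED` table, and the message wording are hypothetical and not copied from NumPy's source.

```python
import types

# Hypothetical table mapping removed names to a migration hint.
_REMOVED = {"alltrue": "Use `all` instead."}


def _module_getattr(name):
    """PEP 562 hook: called only when `name` is not found on the module."""
    if name in _REMOVED:
        raise AttributeError(
            f"`{name}` was removed from the main namespace. {_REMOVED[name]}"
        )
    raise AttributeError(f"module 'demo' has no attribute {name!r}")


# Stand-in for a real package. In a module file this hook would simply be a
# top-level `def __getattr__(name): ...`.
demo = types.ModuleType("demo")
demo.all = all
demo.__getattr__ = _module_getattr
```

Accessing `demo.alltrue` now fails immediately with an `AttributeError` whose message states the replacement, while `demo.all` keeps working. Unlike a deprecation warning, this cannot be silently ignored.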
Thanks for the summary of that discussion @ngoldbaum. Overall that sounds quite good to me.
I agree with @andyfaff here that this is a maintenance burden. Also, there is no evidence that this is necessary or even desired at this point. We should not do this for now; if there is enough demand it can easily be done quickly around the RC period. Also, it's good to keep in mind that we are already doing, planning to do, or at least discussing the following:
At some point we've got to stop - the above list is enough. Also a reminder that we've got, according to our original planning, a little over 4 months left and a ton of work to do. We haven't even started on some of the main topics on our wish list. So let's please avoid more new work here that we didn't plan for, like a compat package.
I've now heard this one too many times, so let me write down my assessment here. First, the chance of turning the NumPy 1.26 to 2.0 transition into anything like the Python 2.7 to Python 3 one is extremely low. Second, if it does happen, the changes under discussion in this issue are quite unlikely to be the root cause. Instead, it'll be because something went wrong with either the C ABI change or our assessment of its impact. That is still the most impactful change we are planning for, and at this point it's still not completely clear what this will look like [1]. Regarding Python 3, the main causes were (highest-impact one first):
This issue is about niche APIs and changes that are not hard to adapt to. It's most similar to (4), but a lot less impactful since we're not touching anything that's idiomatic or heavily used. Footnotes
|
Hi all! |
I agree we have no big risk of a full-blown Python 2-3 situation. Unfortunately, the transition may not be as smooth as we hope, mainly for these reasons:
For Python 2-3, even the larger well maintained libs took a long time to transition I think. We will not have that problem for sure, I think (my main worry would be numba/cupy maybe due to promotion changes, but I think they may be OK with having a 95% fix for those; in practice they are only at 95% to begin with). I don't think I believe in a compat package, downgrading NumPy seems OK as a hot-fix if we go that far. |
Discussion has died down in here and all of the PRs implementing the main namespace refactor have been merged. I'm going to close this. I don't think there's much appetite for a compat package and instead I think we're going to point people to |
Hi. We noticed numpy.alltrue was gone, and there's no lint for it. Not sure if there are lints for other missing things... I didn't test for that. pygame/pygame#4239 (comment) |
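A lint-style check for removed names can be approximated with the standard-library `ast` module. The sketch below is only an illustration under stated assumptions: `find_removed` and the `REMOVED` table are hypothetical, the table is deliberately incomplete (only `alltrue`, the name reported above, with `all` as its replacement), and the check only catches direct attribute access such as `np.alltrue`, not `from numpy import alltrue`.

```python
import ast

# Hypothetical, deliberately incomplete table of removed name -> replacement.
REMOVED = {"alltrue": "all"}


def find_removed(source, aliases=("np", "numpy")):
    """Return sorted (line, removed_name, replacement) for `<alias>.<removed_name>`."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Attribute)
            and isinstance(node.value, ast.Name)
            and node.value.id in aliases
            and node.attr in REMOVED
        ):
            hits.append((node.lineno, node.attr, REMOVED[node.attr]))
    return sorted(hits)


if __name__ == "__main__":
    sample = "import numpy as np\nok = np.alltrue(mask)\n"
    for line, old, new in find_removed(sample):
        print(f"line {line}: np.{old} was removed; use np.{new}")
```

Because the source is only parsed and never imported, the check works without NumPy installed and can run over a codebase before upgrading.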
Hi @illume,
|
Hi @rgommers @seberg @ngoldbaum!
NEP 52 [1] (tracking issue #23999) describes cleaning up NumPy's Python API. I started rewiring it from the main namespace, taking a top-down approach: first clean up the top namespace, then step down to submodules.
Below I share the top NumPy namespace (so items that are available within np.*) with entries to be removed from there. I took an aggressive approach to cut as much as possible (from 563 items here I propose to drop 221, so about 40% of the main namespace), so I expect to relax this list after a review. UPDATE The final list removes 18% of the main namespace.
It mostly covers moving duplicates, removing multiple aliases, moving dtype classes and aliases to the np.dtypes submodule proposed in NEP 52, and considers some unused/deprecated methods mentioned in previous issues/PRs.
This list doesn't concern removing any function per se, only restructuring the main namespace.
Please share your feedback!
[UPDATE] Latest list can be found in comments below: #24306 (comment)
[UPDATE 31.08] Here's the final list with "remove" and "stay" columns:
Footnotes
NEP 52 link ↩