Skip to content

API: Remove __array_prepare__ #25105

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Nov 13, 2023
Merged

API: Remove __array_prepare__ #25105

merged 4 commits into from
Nov 13, 2023

Conversation

seberg
Copy link
Member

@seberg seberg commented Nov 10, 2023

We don't call array-prepare consistently only for normal ufuncs, it doesn't seem used by many people (the main use-case seems units and masked-arrays, both of which are clearly much better served by __array_ufunc__, even a bad version of it).

Now, there is one dance/change that needs to be OK, if the user does NOT implement __array_prepare__, we make it a true no-op now, while previously it would have done arr.view(type=subclass).

That matters in one case in the tests where a faulty __array_wrap__ implementation expected to get a subclass.

We don't call array-prepare consistently only for normal ufuncs,
it doesn't seem used by many people (the main use-case seems units
and masked-arrays, both of which are clearly much better served
by `__array_ufunc__`, even a bad version of it).

Now, there *is* one dance/change that needs to be OK, if the user
does NOT implement `__array_prepare__`, we make it a true no-op
now, while previously it would have done `arr.view(type=subclass)`.

That matters in one case in the tests where a faulty `__array_wrap__`
implementation expected to get a subclass.
.. note:: If a class defines the :func:`__array_ufunc__` method,
this disables the :func:`__array_wrap__`,
:func:`__array_prepare__`, :data:`__array_priority__` mechanism
described below for ufuncs (which may eventually be deprecated).
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe someone disagrees that this comment is useless. We don't disable it, you just have to view() as a base ndarray to avoid recursion.

If necessary, the result will be cast to the data-type(s) of the provided
output array(s).
If the output has an :obj:`~.class.__array_wrap__` method it is called instead
of the one found on the inputs.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no idea where that __array__ nonsense ever came from...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, very weird! I think it predates my changes to this file...

if arr.dtype.char in "SUbc":
return arr.view(type(self))
return arr

def __array_finalize__(self, obj):
# The b is a special case because it is used for reconstructing.
if not _globalvar and self.dtype.char not in 'SUbc':
if self.dtype.char not in 'SUbc':
raise ValueError("Can only create a chararray from string data.")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mmap also had such a global once upon a time. I don't get it. If there is a super strange reason for why it's needed let's put it back when it shows up...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Strange, indeed. It looks like it protects the actual creation of the array (which will call __array_finalize__), but I'm not sure what that is needed. And tests pass...

if isinstance(obj, MyArray):
if not isinstance(obj, MyArray):
obj = obj.view(MyArray)
if obj.metadata is None:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is the one change, it now gets a plain array, not a subclass. The user can actually "detect" that case by checking if context!=None

@@ -116,7 +116,7 @@ def _raise_linalgerror_qr(err, flag):

def _makearray(a):
new = asarray(a)
wrap = getattr(a, "__array_prepare__", new.__array_wrap__)
wrap = getattr(a, "__array_wrap__", new.__array_wrap__)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, if anyone can tell me why this was changed to use array-prepare rather than array-wrap at some point, then...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I always wondered about that one too...

@seberg
Copy link
Member Author

seberg commented Nov 10, 2023

This was originally introduced ~14 years ago by @ddale. But to me there is little logic in keeping this patch-work around. The one useful implementation I saw, had the purpose of raising an error on impossible ufunc calls with a Units class, but those are much clearer with __array_ufunc__.
__array_wrap__ is fine for trivial dragging around of metadata (maybe, TBH) and for non-array objects, in which case any subclass stuff doesn't really matter.

The main point is: the use in linalg doesn't even make remotely sense to me. And dealing with ufuncs is easier with __array_ufunc__.
(__array_wrap__ is different, because we use it on non-ufuncs sometimes and maybe should do so more often. Also it is just used more often in practice.)

@seberg
Copy link
Member Author

seberg commented Nov 10, 2023

@mhvk astropy has an __array_prepare__ actually? I don't understand what the purpose is. It would trigger much less consistently then __array_wrap__.
The small advantage is erroring out earlier maybe when the mask shape just doesn't add up, but overall, I am confused about why.

@mhvk
Copy link
Contributor

mhvk commented Nov 12, 2023

@seberg - I was about as surprised as you that there still was an __array_prepare__ -- it is in astropy.nddata, which I'm less familiar with, but it is hard to imagine we could not fairly easily get rid of it (replacing with __array_wrap__ as needed).

Copy link
Contributor

@mhvk mhvk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@seberg - nice to get rid of that one! My main question is whether there should be a full deprecation cycle. Right now, you remove/break ndarray.__array_prepare__ but still call out __array_prepare__ on inputs. But if a subclass does super() this will break. Might it be better to just remove it altogether? Or do a proper deprecation also of ndarray.__array_prepare__?

p.s. Numpy 2.0 perhaps give one a reason to just remove it!?

If necessary, the result will be cast to the data-type(s) of the provided
output array(s).
If the output has an :obj:`~.class.__array_wrap__` method it is called instead
of the one found on the inputs.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, very weird! I think it predates my changes to this file...

if arr.dtype.char in "SUbc":
return arr.view(type(self))
return arr

def __array_finalize__(self, obj):
# The b is a special case because it is used for reconstructing.
if not _globalvar and self.dtype.char not in 'SUbc':
if self.dtype.char not in 'SUbc':
raise ValueError("Can only create a chararray from string data.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Strange, indeed. It looks like it protects the actual creation of the array (which will call __array_finalize__), but I'm not sure what that is needed. And tests pass...

@@ -1156,6 +1156,14 @@ prepare_ufunc_output(PyUFuncObject *ufunc,
PyArrayObject *arr;
PyObject *args_tup;

/* Use a deprecation warning, since end-users shouldn't worry. */
if (DEPRECATE(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, this seems a bit strange: you deprecate the call out, but any subclass that does super().__array_prepare__(...) will now fail. Or am I missing something?

@@ -116,7 +116,7 @@ def _raise_linalgerror_qr(err, flag):

def _makearray(a):
new = asarray(a)
wrap = getattr(a, "__array_prepare__", new.__array_wrap__)
wrap = getattr(a, "__array_wrap__", new.__array_wrap__)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I always wondered about that one too...

@seberg
Copy link
Member Author

seberg commented Nov 13, 2023

p.s. Numpy 2.0 perhaps give one a reason to just remove it!?

That was my first inclination, but then thought it's not pressing enough that I care about removing it out-right.

Or do a proper deprecation also of ndarray.array_prepare?

This is the best I could think of. If you inherit it, you must not get warnings, I don't think I want to add an ndarray.__getattr__ for this. Neither do I want to do the dance of inspecting the bound method for what is inside on every subclass.
So this is knowingly a pragmatic deprecation without jumping through too many silly hoops for this silly function. (Yes, i forgot about super() use, but in practice I bet it is very niche.)

@mattip
Copy link
Member

mattip commented Nov 13, 2023

+1 for a direct removal from me.

Numpy 2.0 perhaps give one a reason to just remove it!?

yup

@mattip
Copy link
Member

mattip commented Nov 13, 2023

Rerunning the PyPy windows CI, which timed out after 60 minutes

@seberg seberg changed the title DEP: Deprecate __array_prepare__ API: Remove __array_prepare__ Nov 13, 2023
@seberg
Copy link
Member Author

seberg commented Nov 13, 2023

Fair, I don't really care either way, so added a commit to just out-right remove it. If we don't squash them, could go back to the less escalated version easier maybe.

@@ -1602,7 +1406,7 @@ execute_ufunc_loop(PyArrayMethod_Context *context, int masked,
}

/* The reset may copy the first buffer chunk, which could cause FPEs */
if (NpyIter_ResetBasePointers(iter, baseptrs, NULL) != NPY_SUCCEED) {
if (NpyIter_Reset(iter, NULL) != NPY_SUCCEED) {
Copy link
Member Author

@seberg seberg Nov 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Due to delay bufalloc (which isneeded as per comment), need to always reset, but basepointers was only for crazy array-prep support.

@seberg seberg force-pushed the depr-array-prepare branch from 688078d to d3f0bb2 Compare November 13, 2023 11:37
@seberg seberg force-pushed the depr-array-prepare branch from d3f0bb2 to a72d735 Compare November 13, 2023 12:19
Copy link
Contributor

@mhvk mhvk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good modulo the small comment on the docs; since that is quite trivial, I'll approve now.

Aside: should we deprecate __array_wrap__ for 2.0? I guess less easy since MaskedArray uses it, but it would be nice to get rid of it...

.. note:: For ufuncs, it is hoped to eventually deprecate this method in
favour of :func:`__array_ufunc__`.
favour of :func:`__array_ufunc__` and :func:`__array_function__`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not true for ufuncs, but I guess you were trying to include the few non-ufuncs that use it too? Since those are far fewer and really that is even more a strange leftover, maybe rewrite:

It is hoped to eventually deprecate this method in favour of
func:`__array_ufunc__` for ufuncs (and :func:`__array_function__`
for a few other functions like :func:`np.squeeze`).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adopted that. I am not even sure if we should care to ever remove it, although stopping to pass context could be on the table probably.
In a sense, there is a small use-case for very simple array-like objects, e.g. that do nothing but add a few methods or package a NumPy array. For those, __array_function__ is a bit trickier than it should be unfortunately.

@seberg
Copy link
Member Author

seberg commented Nov 13, 2023

Aside: should we deprecate array_wrap for 2.0? I guess less easy since MaskedArray uses it, but it would be nice to get rid of it...

Since I answered in the comment only. My answer would be "no". torch uses it, pandas uses it, and yes masked arrays use it and are annoying enough on their own.

More importantly, I don't think it is ideal, but it isn't super unreasonable as long as your type doesn't drag metadata around (which makes it unreasonable for masked arrays of course).
For such "simple" cases, we actually miss a great story, since __array_function__ is a tedious there.

@mhvk
Copy link
Contributor

mhvk commented Nov 13, 2023

Carrying metadata using __array_wrap__ will give somewhat incomplete coverage, but given that it seems used quite widely, absolutely fine to keep it!

@mattip mattip merged commit 74e03d3 into numpy:main Nov 13, 2023
@mattip
Copy link
Member

mattip commented Nov 13, 2023

Thanks @seberg.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants