-
-
Notifications
You must be signed in to change notification settings - Fork 10.8k
DEP: Deprecate setting the strides and dtype of a numpy array #28901
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Error paths are both incorrect and untested, so that is definitely required.
Other than that, should have a release note as well. But I guess you had it as draft for a reason :).
numpy/_core/src/multiarray/getset.c
Outdated
@@ -124,6 +124,11 @@ array_strides_set(PyArrayObject *self, PyObject *obj, void *NPY_UNUSED(ignored)) | |||
npy_intp upper_offset = 0; | |||
Py_buffer view; | |||
|
|||
/* DEPRECATED 2025-05-04, NumPy 2.3 */ | |||
PyErr_WarnEx(PyExc_DeprecationWarning, | |||
"Setting the strides on a Numpy array has been deprecated in Numpy 2.3.\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could point to strided_window_view
and stride_tricks.as_strided
. Although, not sure it is needed for this one.
msg = "Setting the strides on a Numpy array has been deprecated" | ||
arr = np.arange(48) | ||
with pytest.warns(DeprecationWarning, match=msg): | ||
arr.strides = arr.strides |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please move the tests to test_deprecations.py
. They should test the error path as well (which test_deprecations
machinery does for you, although you can do it explicitly also.).
This comment was marked as resolved.
This comment was marked as resolved.
The We could use Any other ideas on how to approach this? Update: refactored the PR to use unique references for warning. This will exclude the calls from inside |
df42838
to
2fa5086
Compare
Unique reference cheeck seems OK to me, also just to be nice for a bit. Overall, I think there is no reason not to pass the As for some newer code changes here, let's avoid any warning filtering (yes I guess on new Python it's threadsafe at least, but...). Either, this still works fine now that you did the refcount check, or we should just use a (I am not certain the refcount checks will work on PyPy as is, so there might be a problem with using it for avoiding th |
af05298
to
c27f61a
Compare
The warning filtering is indeed not nice. But using a view won't work directly. For example for the recarray the dtype is updated in the array finalizer. Lines 409 to 416 in c27f61a
I can create a view with the new dtype (maybe this will create an infinite recursion, I will have to try), but I cannot replace self with the new view. |
Right, this was maybe the reason for why it wasn't deprecated before. I think we need a solution for it, though. And as much as I dislike this record array stuff and we should maybe discourage its use. My first thought is one or both of:
Unless the "unique reference" path can fix this (but I guess it didn't). The point is, if our code needs a warning manager, then we are just kicking the real solution down the road, since eventually that should be an error and the warning context manager will stop working anyway. |
b933d58
to
11f6569
Compare
Do you means checking inside the dtype setter whether the object is a subclass of masked array or recarray? That is possible (a bit inconvenient as the masked array and rec array are defined in python).
This seems a reasonable approach. Any user making use of It would be much nicer if we could rewrite the code for masked array and recarray to not update the dtype at all. If we are not able to do this, then perhaps there are users with their own subclasses that will face the same issues. For masked array the most changes are introduced in #10314 and #5943. @ahaldane @mhvk do you know whether avoiding setting the dtype in the finalizer would be possible?
The unique reference path does not help here, at the point where the recarray or masked array sets the dtype, we already have 5 references to the object.
Kicking down the solution might be ok at first. This PR is to signal users we are deprecating setting the |
I was thinking for It would be nice to just not need any of this and a great improvement to try and do something about it. But it might be a rather deep rabbit hole...
Yeah it may be fine. It would be nice not to if it isn't hard, for three reasons (none of which are quite blocking maybe):
Yes, but especially for record arrays it may be a rather deep rabbit hole as mentioned above. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, this is nice! Some mostly small comments in-line. My main comment echoes the discussion -- ideally, we avoid this whole _set_dtype
. Will try to have a bit of a look at least for MaskedArray
...
@@ -124,6 +124,14 @@ array_strides_set(PyArrayObject *self, PyObject *obj, void *NPY_UNUSED(ignored)) | |||
npy_intp upper_offset = 0; | |||
Py_buffer view; | |||
|
|||
/* DEPRECATED 2025-05-04, NumPy 2.3 */ | |||
int ret = PyErr_WarnEx(PyExc_DeprecationWarning, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not use the DEPRECATE()
macro? As e.g. done at
numpy/numpy/_core/src/multiarray/descriptor.c
Lines 1845 to 1848 in 949554c
if (DEPRECATE("Data type alias 'a' was deprecated in NumPy 2.0. " | |
"Use the 'S' alias instead.") < 0) { | |
return NULL; | |
} |
array_descr_set(PyArrayObject *self, PyObject *arg, void *NPY_UNUSED(ignored)) | ||
{ | ||
// to be replaced with PyUnstable_Object_IsUniquelyReferenced https://github.com/python/cpython/pull/133144 | ||
int unique_reference = (Py_REFCNT(self) == 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess we probably do not care too much about a missed deprecation warning in some cases on python 3.14, but we do have
numpy/numpy/_core/src/multiarray/temp_elide.c
Line 120 in 949554c
check_unique_temporary(PyObject *lhs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
arr
isn't a temporary though, so it probably doesn't apply but the "unique" part of that function should apply.
|
||
def set_dtype(x): | ||
x.dtype = int | ||
self.assert_deprecated(lambda: set_dtype(x)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe simpler and clearer as self.assert_deprecated(setattr, x, "dtype", int)
?
s = x.strides | ||
x.strides = s | ||
|
||
self.assert_deprecated(lambda: set_strides(x)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As above
@@ -1215,7 +1216,9 @@ def test_zero_stride(self): | |||
arr = np.broadcast_to(arr, 10) | |||
assert arr.strides == (0,) | |||
with pytest.raises(ValueError): | |||
arr.dtype = "i1" | |||
with warnings.catch_warnings(): # gh-28901 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might as well test that the deprecation warning happens here?
with pytest.warns(DeprecationWarning, match="Setting the dtype"):
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are some cases where we sometimes get a warning, and sometimes not (depending on optimizations perhaps? or pypy? refcounting is funny stuff). For that reason I picked warnings.catch_warnings()
which works for either a warning or no warning.
I will check once more whether we can use a pytest.warns
here.
@@ -366,7 +368,11 @@ def make_array(size, offset, strides): | |||
offset=offset * x.itemsize) | |||
except Exception as e: | |||
raise RuntimeError(e) | |||
r.strides = strides = strides * x.itemsize | |||
|
|||
with warnings.catch_warnings(): # gh-28901 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As above, I think one might as well use pytest.warns
(and below too)
|
||
# test 0d | ||
arr_0d = np.array(0) | ||
arr_0d.strides = () | ||
arr_0d = np.lib.stride_tricks.as_strided(arr_0d, strides = ()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove spaces around =
for arguments to functions (PEP8) -- also in the other cases below.
@@ -101,7 +102,10 @@ def as_strided(x, shape=None, strides=None, subok=False, writeable=True): | |||
array = np.asarray(DummyArray(interface, base=x)) | |||
# The route via `__interface__` does not preserve structured | |||
# dtypes. Since dtype should remain unchanged, we set it explicitly. | |||
array.dtype = x.dtype | |||
with warnings.catch_warnings(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer not to have _set_dtype
at all, but if we keep it, then this seems a good place to use it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, the idea of having this work around, is exactly the ability to avoid any warning context managers both in NumPy (and potentially outside of NumPy).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. I wanted to be on the safe side since there is a difference between .dtype = ...
and ._set_dtype(...)
for subclasses that override setting attributes. Here array is created by np.asarray
so it is fine and I will update the code.
@@ -3492,7 +3492,8 @@ def dtype(self): | |||
|
|||
@dtype.setter | |||
def dtype(self, dtype): | |||
super(MaskedArray, type(self)).dtype.__set__(self, dtype) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused - isn't it OK to have deprecation warnings also for MaskedArray
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a valid code path from numpy where the dtype of a masked array is updated (I think it was the PyArray_View
) and the refcount is already larger than 1. In that case we do not want to generate a warning, hence the use of _set_dtype
.
The downside of this approach is that if a user changes the dtype, we will not get a deprecation warning.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is why I was suggesting to have the warning only for ndarray
proper -- then we can just not worry about MaskedArray
and recarray
yet.
@@ -2126,7 +2126,11 @@ def test_assign_dtype(self): | |||
a = np.zeros(4, dtype='f4,i4') | |||
|
|||
m = np.ma.array(a) | |||
m.dtype = np.dtype('f4') | |||
with warnings.catch_warnings(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here for sure use pytest.warns
- we do want to check warnings are emitted also for MaskedArray
(and also below of course)
Darn, what a rabbit hole indeed! One possible solution is to start giving the warning only on
Not ideal, but at least it makes a start, and we don't need the new |
I guess I am not too worried about a |
Good point about 2.3 being around the corner. I think following my suggestion is easiest in the sense that it introduces no new API, and we can then think about how to do |
See #28800