-
-
Notifications
You must be signed in to change notification settings - Fork 10.8k
BUG: Distinguish exact vs. equivalent dtype for C type aliases. #21995
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
For `asarray` and for the `dtype` equality operator, equivalent dtype aliases were considered exact matches. This change ensures that the returned array has a descriptor that exactly matches the requested dtype. Note: Intended behavior of `np.dtype('i') == np.dtype('l')` is to test compatibility, not identity. This change does not affect the behavior of `PyArray_EquivTypes()`, and the `__eq__` operator for `dtype` continues to map to `PyArray_EquivTypes()`. Fixes numpy#1468.
Use the correct API call to actually set the base object.
Fix a false mismatch. Separate dtype objects, even if equivalent, cause distinct array views to be created.
The updates that were necessary for the tests indicate that this change will surprise some users. Some sort of release note is warranted, and perhaps some sort of update to docs related to the Suggestions for specific docs files to update? |
* Improve comments/docs. * Improve descriptiveness of variable names. * Add additional test expressions that would not pass without this patch.
I couldn't find any docs that were specific enough to require updating, but I added a release notes doc. |
Thanks this is awesome. Those are long docs, if we know a place to put them, we probably should... For the release note, it would be nice to make it much shorter unfortuntely! EDIT: I (or someone else from us) can take a stab at that though! |
Is my proposed edit better? Or is there an alternative proposal? |
Thanks @eirrgang |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry. I was away for the weekend and just got caught up. Looks good, but I think there is a typo.
See #22227 |
Hi-five on merging your first pull request to NumPy, @eirrgang! We hope you stick around! Your choices aren’t limited to programming – you can review pull requests, help us stay on top of new and old issues, develop educational material, work on our website, add or improve graphic design, create marketing materials, translate website content, write grant proposals, and help with other fundraising initiatives. For more info, check out: https://numpy.org/contribute |
As reported in gh-22233, there is a serious issue with this PR, likely a memory leak that is causing issues for downstream projects like SciPy and MNE-Python. I will revert this PR. @eirrgang it would be great if you could resubmit this as a new PR, and then we can merge it again once the problem is understood and addressed. Maybe |
Ugh, GitHub: Sorry, this pull request couldn’t be reverted automatically. It may have already been reverted, or the content may have changed since it was merged. |
PyArray_FLAGS(oparr), | ||
op, | ||
op | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, either I just didn't think of it, or expected that this steals a reference to op
for the base
. We are missing a DECREF, I will make a PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think I originally tried PyArray_NewFromDescr
with op
for *data
and nullptr
for *obj
, and PyArray_SetBaseObject
or something that steals the reference for the base. Anyway, I guess we were sloppy when switching to PyArray_NewFromDescrAndBase
. Thanks for the quick catch!
The new path to preserve dtypes provided by creating a view got the reference counting wrong, because it also hit the incref path that was needed for returning the identity. This fixes up numpygh-21995 Closes numpygh-22233
Yes, sorry, I missed where this test was, it should be elsewhere. I had just thought it looked reasonably thorough. It might also be split up, but I don't care. In any case, we need to move this test into |
Sure. I'll make a new PR. As I recall, there wasn't an obvious good place for the test. I'll look again, and try to come up with something reasonable. |
Way too long, but |
As noted at #21995 (comment), the new test from #21995 was placed in a directory intended for the Array API, and unrelated to the change. * Consolidate test_dtype_identity into an existing test file. Remove `test_asarray.py`. Create a new `TestAsArray` suite in `test_array_coercion.py` * Linting. Wrap some comments that got too long after function became a method (with additional indentation).
This appears to have broken some tests in This shows the essence of the problem:
|
Yes. My understanding from @seberg is that this is the intended behavior, and it is reflected in the updates to the numpy tests and release notes. Please advise if you find contradictions in the documentation. Some documentation was trimmed, so you might find the commit history of the PR informative. See also https://github.com/numpy/numpy/pull/21995/files#r922726648 |
@WarrenWeckesser, yeah, that was intentional but that doesn't mean it was the best thing to do. Please do followup if you think we should consider undoing (parts) of this. |
For
asarray
and for thedtype
equality operator,equivalent dtype aliases were considered exact matches.
This change ensures that the returned array has a descriptor
that exactly matches the requested dtype.
Note: Intended behavior of
np.dtype('i') == np.dtype('l')
is to test compatibility, not identity. This change does not
affect the behavior of
PyArray_EquivTypes()
, and the__eq__
operator fordtype
continues to map toPyArray_EquivTypes()
.Fixes #1468.