-
-
Notifications
You must be signed in to change notification settings - Fork 10.8k
BUG: Disallow type-promotion of dtypes with different field names #15509
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
493db3e
to
df1ae03
Compare
Why does |
As I recall, casting (eg, This PR has doesn't affect |
6695f7a
to
4e840ae
Compare
numpy/core/tests/test_multiarray.py
Outdated
# gh-15494 | ||
A = ("a", "<i8") | ||
B = ("b", "<i8") | ||
ab = np.rec.array(obj=np.array([(1, 2), dtype='<i8']), dtype=[A, B]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's not throw recarray
in here, the issue is more fundamental:
ab = np.rec.array(obj=np.array([(1, 2), dtype='<i8']), dtype=[A, B]) | |
ab = np.array([(1, 2)], dtype=[A, B]) |
numpy/core/tests/test_multiarray.py
Outdated
A = ("a", "<i8") | ||
B = ("b", "<i8") | ||
ab = np.rec.array(obj=np.array([(1, 2), dtype='<i8']), dtype=[A, B]) | ||
ba = np.rec.array(obj=np.array([(1, 2), dtype='<i8']), dtype=[B, A]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ba = np.rec.array(obj=np.array([(1, 2), dtype='<i8']), dtype=[B, A]) | |
ba = np.array([(1, 2)], dtype=[B, A]) |
Oh, yeah, this PR isn't quite right. It should totally ignore the field names, since can_cast should only depend on the field order.... working on it |
Ah, but if we do that we'd be declaring the concatenate behavior as intentional, which I'm not sure is right - or at least, it wouldn't make the problem go away. |
|
||
val = PyObject_RichCompareBool(from->names, to->names, Py_EQ); | ||
if (val != 1 || PyErr_Occurred()) { | ||
PyErr_Clear(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think you should clear the error here, else MemoryError
translates to "can't cast" which is nonsense.
while (PyDict_Next(field1, &ppos, &key, &tuple1)) { | ||
if ((tuple2 = PyDict_GetItem(field2, key)) == NULL) { | ||
while (PyDict_Next(from->fields, &ppos, &key, &tuple1)) { | ||
if ((tuple2 = PyDict_GetItem(to->fields, key)) == NULL) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if ((tuple2 = PyDict_GetItem(to->fields, key)) == NULL) { | |
if ((tuple2 = PyDict_GetItemWithError(to->fields, key)) == NULL) { | |
if (PyErr_Occurred()) { | |
return -1; | |
} |
It seems like we want for two dtypes with different field names but same field dtypes:
So can_cast should succeed, but promote_types should fail. Does that make sense? |
4e840ae
to
62a732f
Compare
All right. So I fixed up There was a complication, that when doing I also added a paragraph to the structured array docs about all this. |
fee0266
to
85624c5
Compare
@@ -152,7 +152,8 @@ def test_recarrays(self): | |||
|
|||
self._test_equal(a, b) | |||
|
|||
c = np.empty(2, [('floupipi', float), ('floupa', float)]) | |||
c = np.empty(2, [('floupipi', float), | |||
('floupi', float), ('floupa', float)]) | |||
c['floupipi'] = a['floupi'].copy() | |||
c['floupa'] = a['floupa'].copy() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change was needed because now the comparison 3 lines below here falls into the elementwise comparison case, since b and c were now castable after thie PR. So here I added an extra field to make sure b and c are not castable so they trip the warning.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally looks good, thanks!
Some style comments, and a test I think we missed.
if (!PyErr_Occurred()) { | ||
PyErr_SetString(PyExc_ValueError, "bug, should not happen"); | ||
} | ||
return 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return 0; | |
return -1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, guess the caller doesn't support this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In which case, you'll need to call PyErr_Clear()
here instead.
I've a local branch that tries to propagate errors out of here, but it's a bit invasive, and less important than this patch.
Py_INCREF(type1); | ||
return type1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a test for this behavior? In that if two dtypes with different padding are specified, the first/last one is returned by identity?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually the way I wrote it using "equivalent_fields" this rejects promotion for dtypes with different padding/offsets. Probably that's safer anyway so no-one gets their padding changed unexpectedly.
Added a test for that case.
85624c5
to
08a6460
Compare
487825f
to
320fdef
Compare
Pushing off to 1.20.x. |
d89df8f
to
1c2cf51
Compare
Ping all, and rebase needed. |
This needs more than a rebase now :), there is an additional code path here on master which would also need updating. Happy to review! But unless pinged explicitly, I will not prioritize this until the code that the current PR touches is deleted :). |
@seberg, just to be clear, you recommend I wait before rebasing until you've completed more of the dtype updates? I'm looking through the new master code now, I can see you have some TODOs in the code related to this PR. |
@ahaldane yeah, I specifically marked them. I am happy to rebase it now, it probably isn't much harder; We just have to change both the new and the old version (which probably just means keep the changes here and do the same changes in the new version). |
This should go in after #18676 (it will need some fixups) |
This PR replaces the old numpygh-15509 implementing proper type promotion for structured voids. It further fixes the casting safety to consider casts with equivalent field number and matching order as "safe" and if the names, titles, and offsets match as "equiv". The change perculates into the void comparison, and since it fixes the order, it removes the current FutureWarning there as well. This addresses liberfa/pyerfa#77 and replaces numpygh-15509 (the implementation has changed too much). Fixes numpygh-15494 (and probably a few more) Co-authored-by: Allan Haldane <allan.haldane@gmail.com>
Closing, gh-19226 is the equivalent change taking into account that all of this has changed quite a lot. (Unfortunately, there is currently still a small API addition missing for DTypes to finish the puzzle.) |
This PR replaces the old numpygh-15509 implementing proper type promotion for structured voids. It further fixes the casting safety to consider casts with equivalent field number and matching order as "safe" and if the names, titles, and offsets match as "equiv". The change perculates into the void comparison, and since it fixes the order, it removes the current FutureWarning there as well. This addresses liberfa/pyerfa#77 and replaces numpygh-15509 (the implementation has changed too much). Fixes numpygh-15494 (and probably a few more) Co-authored-by: Allan Haldane <allan.haldane@gmail.com>
This PR replaces the old numpygh-15509 implementing proper type promotion for structured voids. It further fixes the casting safety to consider casts with equivalent field number and matching order as "safe" and if the names, titles, and offsets match as "equiv". The change perculates into the void comparison, and since it fixes the order, it removes the current FutureWarning there as well. This addresses liberfa/pyerfa#77 and replaces numpygh-15509 (the implementation has changed too much). Fixes numpygh-15494 (and probably a few more) Co-authored-by: Allan Haldane <allan.haldane@gmail.com>
This PR replaces the old numpygh-15509 implementing proper type promotion for structured voids. It further fixes the casting safety to consider casts with equivalent field number and matching order as "safe" and if the names, titles, and offsets match as "equiv". The change perculates into the void comparison, and since it fixes the order, it removes the current FutureWarning there as well. This addresses liberfa/pyerfa#77 and replaces numpygh-15509 (the implementation has changed too much). Fixes numpygh-15494 (and probably a few more) Co-authored-by: Allan Haldane <allan.haldane@gmail.com>
This PR replaces the old numpygh-15509 implementing proper type promotion for structured voids. It further fixes the casting safety to consider casts with equivalent field number and matching order as "safe" and if the names, titles, and offsets match as "equiv". The change perculates into the void comparison, and since it fixes the order, it removes the current FutureWarning there as well. This addresses liberfa/pyerfa#77 and replaces numpygh-15509 (the implementation has changed too much). Fixes numpygh-15494 (and probably a few more) Co-authored-by: Allan Haldane <allan.haldane@gmail.com>
This PR replaces the old numpygh-15509 implementing proper type promotion for structured voids. It further fixes the casting safety to consider casts with equivalent field number and matching order as "safe" and if the names, titles, and offsets match as "equiv". The change perculates into the void comparison, and since it fixes the order, it removes the current FutureWarning there as well. This addresses liberfa/pyerfa#77 and replaces numpygh-15509 (the implementation has changed too much). Fixes numpygh-15494 (and probably a few more) Co-authored-by: Allan Haldane <allan.haldane@gmail.com>
This PR replaces the old numpygh-15509 implementing proper type promotion for structured voids. It further fixes the casting safety to consider casts with equivalent field number and matching order as "safe" and if the names, titles, and offsets match as "equiv". The change perculates into the void comparison, and since it fixes the order, it removes the current FutureWarning there as well. This addresses liberfa/pyerfa#77 and replaces numpygh-15509 (the implementation has changed too much). Fixes numpygh-15494 (and probably a few more) Co-authored-by: Allan Haldane <allan.haldane@gmail.com>
Fixes #15494
The fix seems fairly painless.
I also considered if we should restrict things even more, but didn't do so. For instance, what about if the field names and order are the same, but the byte-offsets of the fields are different. Which dtype "wins"?
Eg:
The behavior shown here is actually the same as the behavior from 1.13 and before (with assign-by-fieldname), where the last dtype would be used in such cases, just like here. Since that is old behavior, I opted to leave it alone.