-
-
Notifications
You must be signed in to change notification settings - Fork 10.8k
WIP, BUG: preserve endianness of concat #21319
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
||
if (single_byteorder == 1 && dtype == NULL) { | ||
descr->byteorder = initial_byteorder; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These 3 lines are the ones that cause somewhat-stochastic test failures in the full suite, that in my hands are not reproducible when testing those individuals tests (or even their entire modules) in isolation on this branch. If they are commented out, then the new test I added fails, but everything else passes.
Reference counting or something else?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem is that descr
is a singleton (or can be). You would have to create a new dtype and swap bytes. (There should be a NewByteOrder
function for that I think).
@@ -382,6 +382,9 @@ PyArray_ConcatenateArrays(int narrays, PyArrayObject **arrays, int axis, | |||
NPY_CASTING casting) | |||
{ | |||
int iarrays, idim, ndim; | |||
char initial_byteorder; | |||
char iterative_byteorder; | |||
int single_byteorder = 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
smaller integer/bool type could be used here maybe?
if (iterative_byteorder != initial_byteorder) { | ||
single_byteorder = 0; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we already have loops that do this iteration, and I did have success fusing this into those loops if that is preferred; obviously it doesn't change the asymptotic complexity
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(we could also break
early I suppose, if that condition is ever satisfied)
* add a regression test for the described issue, and C code changes to make it pass * one curious problem, likely at the C level: this causes a few test failures in the full test suite, but none of them are reproducible if you tell `pytest` to run them in isolation; also, which tests fail is somewhat stochastic--reference counting issue maybe? * other things it may be useful to consider: * most sensible behavior when combining arrays with different endianness when `out` and/or `dtype` are not specified (I've assumed this is covered by the tests already in place, though quite possible it is not) * if we want to do this, need for `versionchanged` directive? maybe just in `concatenate` but not the functions that leverage `concatenate`? * run the tests on a big endian machine on the gcc compile farm to be safe
c296ead
to
15be7d2
Compare
Are we sure we want to preserve endianess here? My feeling would be that the only thing that I would seriously considering is to preserve the In general, I am not sure I feel it makes sense. Users are normally better of with native-byte order. Concatenate always copies the data, so swapping bytes in the concat operation probably makes sense for most users. That is, unless they really rely on byte-order. |
Ok, there isn't much discussion in the issue, and it isn't necessarily labeled a bug I don't think, so I guess I'm not sure what tests I should be aiming to make pass just yet. |
There is a bit more discussion around this topic in gh-15088. I do think for |
Is it best to close this and mention a summary of your points in the matching issue for now? |
Yeah, let me close it then. Thanks Tyler. |
Fixes #7829
add a regression test for the described issue,
and C code changes to make it pass
one curious problem, likely at the C level:
this causes a few test failures in the full test suite,
but none of them are reproducible if you tell
pytest
to run them in isolation; also, which tests fail is
somewhat stochastic--reference counting issue maybe?
other things it may be useful to consider:
endianness when
out
and/ordtype
are not specified (I've assumedthis is covered by the tests already in place, though quite possible it is not)
versionchanged
directive? maybejust in
concatenate
but not the functions that leverageconcatenate
?safe