-
-
Notifications
You must be signed in to change notification settings - Fork 10.9k
BUG: ptp fails on bool #15057
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Workaround for bool data when using ptp Also silences pandas ptp warning in this location xref numpy/numpy#15057
Would you expect the result to be a float or bool? |
Workaround for bool data when using ptp Also silences pandas ptp warning in this location xref numpy/numpy#15057
I'm happy either way as long as I can test using ptp(x, axis=0) == 0. Probably a bool, and so True/False. I assume ptp(x.astype(np.float32), axis=0) is a float32. |
Type seems to be preserved by ptp except where it spills over (int types). |
I'd argue that the current behavior is correct - ptp is not defined over things on which subtraction isn't defined. Having said that, the operation I'd argue that |
FWIW: When I was testing various ways to check that an array is constant earlier this year, the fastest turned out to be |
Perhaps it would be worth adding to numpy somewhere (or perhaps a C-optimized version of the following): def all_same(arr, *, axis):
axis = normalize_axis_tuple(axis, arr.ndim)
sl = [slice(None)]*arr.ndim
for ax in axis:
sl[axis] = slice(None, 1)
return (x == x[tuple(sl)]).all(axis=axis) Note by using |
@eric-wieser The original justification for ptp was that it should be about 2x as fast as using max and min. I think the implementation in NumPy isn't and is actually doing two separate passes. @WarrenWeckesser The potential issue there is that it requires creating an intermediate array that has the same dimension as x. If x is large, then this array will also be large. The ptp method uses a reduction first, and then does creates an intermediate with the same size as the reduction. |
That's a fair point - although at least the array is a boolean so typically 4-8x smaller than x |
For me, the reduction dimension is negligible relative to the shape of the axis being reduced (1000-10000000x smaller), so it is pretty negligible. A loopy implementation of |
In terms of thinking about the choice here, it is hard to understand by why some ops, e.g., Put another way, I don't really understand how |
This seems to be a rabbit hole where it is hard to be fully consistent. For example
|
Workaround for bool data when using ptp Also silences pandas ptp warning in this location xref numpy/numpy#15057
I don't know the details, but there is a lot of relevant history here, including some churn back in version 1.13 when the deprecation of boolean subtraction that was added in 1.9 (I think) was converted to an error (e.g. #9251, #9255); see the links below. Digging into that history is probably necessary to explain how we ended with the current behavior. Can someone familiar with the details and history comment here? Is there more to do, or is the current behavior the best compromise at the moment (and therefore this issue should be closed)? Edit:
|
After this thread, I went with something like def all_same(arr, *, axis):
_max = arr.max(axis=axis)
_min = arr.min(axis=axis)
return _max == _min which is what was originally suggested. Seems to work correctly for |
@bashtage nice. Sounds like we can close this then. I personally think it is OK if |
It would be nice if there was some more consistency for booleans w.r.t when they are treated as 1 and normal int math appears or when adding is treated as |
Yes, I agree, but we are in a tricky position here. Python traditionally treats booleans exactly like 0 and 1 (they are just singletons of a subclass of int). We have been stricter, because it is even stranger in NumPy world (e.g. with limited dtype precision and what happens if you index with a boolean). But we cannot decide that |
Reproducing code example:
Error message:
TypeError: numpy boolean subtract, the
-
operator, is deprecated, use the bitwise_xor, the^
operator, or the logical_xor function instead.Numpy/Python version information:
The text was updated successfully, but these errors were encountered: