-
-
Notifications
You must be signed in to change notification settings - Fork 11.3k
BUG: allow MaskedArray min/max with dtype=object #29576
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Tests the case of arrays of 0-D arrays, and check for identity rather than equality.
I haven't thought too much more about this, but if the discussion above is suggesting that this PR doesn't fix the reproducer in the original ticket, it might be sensible to at least not have this PR set to automatically close/address the ticket, etc. Then the scope might be clearer/a bit less confusing--it may very well be just fine to address the separate issue here without closing that ticket. |
That's reasonable. I changed it to no longer close #9103. |
axis=axis, out=out, **kwargs).view(type(self)) | ||
if result.ndim: | ||
axis=axis, out=out, **kwargs) | ||
if isinstance(result, ndarray) and result.ndim: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is an (extreme) edge case I am aware of here that my code fails to address: ideally, I think that
x = np.array([1, 2])
assert np.fromiter([x], dtype=object).view(np.ma.MaskedArray).min() is x
should pass, but with my change, x will get erroneously converted to a masked array when returned by min
.
This is identical to the old behavior, so it's not necessarily a problem, but it is a failure of my assumptions: initially when working on this PR, I thought I only needed to worry about arrays containing 0-D array objects, since arrays of higher dimension are not orderable (and so you'd just get an error when calling min()
, as you should). But of course I forgot to consider that min()
will succeed regardless if there is only one item. Also, as another similar edge case, higher dimensional arrays actually are orderable if they themselves only contain one item (not just if they are 0-D).
I'm not sure if this is actually worth fixing though. Thoughts?
Adds _UniversalInfinity object which (preferentially) compares greater than all other objects other than itself. This is used as the default min_filler. Previously, min filled masked values with the string '?', which caused a TypeError when the array contained objects that could not be compared with str. Likewise, negative universal infinity is used as the max_filler. See numpy#9103.
NOTE: The reproducer given in numpy#9103, although reported as producing an AttributeError, actually produced a TypeError. The AttributeError was a secondary issue which occurs when there is no mask. Because there were 2 bugs associated with the issue (one of was unnoticed in my previous attempt to fix the issue), we test the reproducer specifically just as confirmation that the issue is fully resolved and I'm not missing anything else. This test *should* be redundant with the added test_minmax_object_dtype tests, but it's not exactly the same.
I have added my proposed fix for the |
Fixes #9103. There are actually 2 bugs here:
AttributeError
is fixed by checking if the result of min/max is anndarray
before calling.view()
.TypeError
(which is what you actually get when you run the #9103 reproducer, even though it is reported as the AttributeError) is fixed by adding a universal infinity filler object which (preferentially) compares greater than all other objects other than itself.In the process of fixing this bug, I also found several other related issues by searching for uses of
.view()
. I was initially planning to fix those in this PR, but I think some of them may require a bit more discussion. I plan to open a new issue about that later.