Skip to content

BUG: Fix np.testing utils failing for masked scalar vs. scalar (#29317) #29318

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

danra
Copy link
Contributor

@danra danra commented Jul 4, 2025

Resolves #29317

Fixes np.testing.assert_XXX utilities failing when comparing a masked scalar against another scalar, with a "nan location mismatch" error. The consistent behavior is to ignore any masked elements, in this case the only one, and therefore succeed.

See also #11121

danra added 5 commits July 3, 2025 22:15
The two added test cases fail with:

E       AssertionError:
E       Arrays are not equal
E
E       nan location mismatch:
E        ACTUAL: MaskedArray(3.)
E        DESIRED: array(3.)

and

E       AssertionError:
E       Arrays are not equal
E
E       nan location mismatch:
E        ACTUAL: MaskedArray(3.)
E        DESIRED: array(nan)
…#29317)

TestArrayEqual.test_masked_scalar now passes.

This case regressed since 7315145 (merged in numpy#12119) due to:
- `<masked scalar> == <scalar>` returning np.ma.masked (not a 0-dim
   masked bool array), followed by
- `np.bool(np.ma.masked)` unintentionally converting it to np._False

There are a few ways to resolve this; I went with testing the
comparison result with `isinstance(bool)` to check if a conversion to
array is necessary, which is the same approach already taken in
assert_array_compare after evaluating `comparison(x, y)`.
Confusingly, "isinstance(..., bool) checks" in the previous wording
actually incorrectly referred to the ones towards the end of the
function, which are not actually related to __eq__'s behavior but to
the possibility of `func` returning a bool.
- Use same language as elsewhere below to explain `!= True` used to
  handle np.ma.masked
- Clarify committed to support standard MaskedArrays
- Restore note lost in 7315145 comment changes about how the
  np.bool casts towards the end of the function handle np.ma.masked,
  and expand further.
@ngoldbaum
Copy link
Member

What is currently broken? What does this PR change?

I think the reason this PR hasn't been reviewed yet is because it has a blank description and does nontrivial things that aren't obvious just from the PR title.

@danra
Copy link
Contributor Author

danra commented Jul 7, 2025

What is currently broken?

Described in the linked bug: np.testing.assert_XXX utilities fail when comparing a masked scalar against another scalar.

it has a blank description

The title says that it fixes the bug though... I moved a later comment "Resolves #29317" into the description, if that's helpful.

@ngoldbaum
Copy link
Member

Described in the linked bug: np.testing.assert_XXX utilities fail when comparing a masked scalar against another scalar.

I don't think the np.testing functions are designed to work with masked arrays at all, see e.g.: #14624.

I'm not sure it makes sense to do one-off fixes like this, especially this one that does a cast based on the result of an isinstance call, which feels pretty brittle to me.

Much better IMO to add e.g. a new np.ma.testing module that intentionally accepts masked arrays and handles all these cases, just like we have for all the other masked operations.

Sadly masked arrays are somewhat of an unloved feature no one as far as I'm aware it working on any systematic fixes for them like this.

@danra
Copy link
Contributor Author

danra commented Jul 7, 2025

Thanks for the reference to #14624.

Personally, I agree it makes more sense to have a separate np.ma.testing module. However, it looks like there have been numerous commits to the current testing module to support masked arrays; since those had been accepted and merged, I took it as a given that that support does need to be kept in case I make any changes. See for example #11121. But, if there's consensus on that direction, then I'd be happy to make such a split to simplify np.testing even though that's more work than I'd planned :)

I've made these one-off changes not for their own sake, but as incremental fixes towards #28827. I'd described that in some more detail under the Context section of #29317:

Ran into this while working on #28827
- while amending assert_array_compare's nan/inf handling to be able to completely remove the additional inf handling in assert_array_almost_equal
- which is needed to avoid assert_array_almost_equal.compare potentially producing an output shape different from its input
- which in turn would avoid additional logic to produce the correct indices of the max abs/rel errors in the original array.

@ngoldbaum
Copy link
Member

ping @mhvk since you merged a fix a while back along these lines - what do you think?

# support them if possible.
if np.bool(x_id == y_id).all() != True:
result = x_id == y_id
if isinstance(result, bool):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels pretty brittle to me. Is there another way to do this without isinstance?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are other ways, but have you seen the rationale in the commit message for d1963d9 ?

There are a few ways to resolve this; I went with testing the
comparison result with isinstance(bool) to check if a conversion to
array is necessary, which is the same approach already taken in
assert_array_compare after evaluating comparison(x, y).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The difference here is that logic has been there a long time, so people have written code to handle it. In this case, you're introducing a possibility for NumPy to suddently generate AttributeErrors in downstream tests when NumPy didn't do that before.

result = x_id == y_id
if isinstance(result, bool):
result = np.bool(result)
if result.all() != True:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Playing devil's advocate: what if someone passes something into this routine that returns something that's not a bool from == but that also doesn't have an all() method. You've introduced a new AttributeError that the cast to np.bool used to guard against.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: np.testing utilities fail for masked scalar vs. scalar
2 participants