Skip to content

Fix calls to np.ma.masked_where #20511

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jul 5, 2021

Conversation

QuLogic
Copy link
Member

@QuLogic QuLogic commented Jun 24, 2021

PR Summary

Due to numpy/numpy#18967, I've audited all our calls to np.ma.masked_*(copy=False). When the input comes directly from the user, or is already stored somewhere already, then we don't want these calls to modify the existing values.

This affects np.ma.masked_where and all simple wrappers around it like masked_less_equal. It does not affect masked_invalid, which continues to not modify the input. I have asked upstream numpy/numpy#19332 whether that is intended. If they intend to change masked_invalid also, then we will need to make a few more changes like this, but we are mostly okay.

PR Checklist

  • Has pytest style unit tests (and pytest passes).
  • Is Flake 8 compliant (run flake8 on changed files to check).
  • [n/a] New features are documented, with examples if plot related.
  • [n/a] Documentation is sphinx and numpydoc compliant (the docs should build without error).
  • Conforms to Matplotlib style conventions (install flake8-docstrings and run flake8 --docstring-convention=all).
  • [n/a] New features have an entry in doc/users/next_whats_new/ (follow instructions in README.rst there).
  • [n/a] API changes documented in doc/api/next_api_changes/ (follow instructions in README.rst there).

For the same reason as Quiver in matplotlib#4250.
@QuLogic QuLogic added this to the v3.4.3 milestone Jun 24, 2021
@QuLogic QuLogic force-pushed the fix-ma.masked_where branch from 076c4e6 to 32d100e Compare June 24, 2021 23:01
@@ -1545,11 +1545,11 @@ class LogNorm(Normalize):

def autoscale(self, A):
# docstring inherited.
super().autoscale(np.ma.masked_less_equal(A, 0, copy=False))
super().autoscale(np.ma.array(A, mask=(A <= 0)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this save the copy step? It seems like this is creating a new array anyways, so I'm wondering if it is gaining anything relative to just removing the copy=False?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default for np.ma.array is copy=False, keep_mask=True. The former will not copy the data, and the latter will use the existing mask (if any) with the new mask, which will make a copy. Removing copy=False from masked_less_equal makes a copy of both.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for clarifying! I was curious how to test if this actually holds and learned about np.shares_memory today. I did verify this assertion with that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can also check, if you have b = np.ma.array(a), that b.base is a (if a is originally a plain array), or b.base is a.base (if a is originally a masked array).

@QuLogic QuLogic force-pushed the fix-ma.masked_where branch from 32d100e to d733f9f Compare June 29, 2021 08:56
NumPy 1.21.0 fixed a bug with `np.ma.masked_where` where `copy=False`
now modifies the input if it is masked, which we do not want to do.
`np.ma.array` defaults to not copying the data, but copies the mask
(adding the new mask), which is what we wanted with `copy=False`.
@QuLogic QuLogic force-pushed the fix-ma.masked_where branch from d733f9f to 48eef46 Compare June 29, 2021 09:00
@dstansby dstansby merged commit 33beff9 into matplotlib:master Jul 5, 2021
meeseeksmachine pushed a commit to meeseeksmachine/matplotlib that referenced this pull request Jul 5, 2021
@QuLogic QuLogic deleted the fix-ma.masked_where branch July 5, 2021 21:37
QuLogic added a commit that referenced this pull request Jul 6, 2021
…511-on-v3.4.x

Backport PR #20511 on branch v3.4.x (Fix calls to np.ma.masked_where)
@jorisvandenbossche
Copy link

@QuLogic a small note that I think this change can cause a bug when the array A is not an array but a list. In np.ma.masked_less_equal(A, 0, copy=False), the A as input to the function is converted an array, while in np.ma.array(A, mask=(A <= 0)) it already needs to be an array up front (otherwise A <= 0 will fail).

We ran into this with GeoPandas (geopandas/geopandas#2066, didn't test it, but I assume this PR is the cause), where we were (incorrectly) setting the array of a color map with a list cmap.set_array(np.array([])).
This was of course wrong to do, as the docs clearly say it should be an ndarray (although in general people might expect that functions needing an array also take general "array-likes", as do most functions in numpy).

Just noting it here, I'll let it to your judgement whether this is worth addressing or not (it's certainly not needed just for GeoPandas).

@greglucas
Copy link
Contributor

Just a note that I am pretty sure this is fixed in master due to a new call to a call to cbook.safe_masked_invalid() in set_array() now.

A = cbook.safe_masked_invalid(A, copy=True)

v3.4.3

>>> sm.set_array([])
>>> sm.get_array()
[]

master

>>> sm.set_array([])
>>> sm.get_array()
masked_array(data=[],
             mask=False,
       fill_value=1e+20,
            dtype=float64)

I don't know if another 3.4.x release is planned sometime, but it should be a pretty simple fix to add that call in to cast to array immediately in the set_array call.

@QuLogic
Copy link
Member Author

QuLogic commented Aug 23, 2021

That came in from #18870, which I don't think we can backport fully. But we can make a small correction to this, as it's not required that autoscale must be called internally only.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants