Skip to content

Minor fix for astropy units support broken in earlier PR #23141

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 27, 2022

Conversation

oscargus
Copy link
Member

PR Summary

See #22929 (comment)

This is not possible to test without importing astropy, but if I only had selected this approach first, no one would have objected. (Our Quantity-mock class does not show this problematic behavior.)

Thanks @greglucas for suggesting the very simple solution (simple enough so that I didn't believe it would solve it...).

For a description of why the earlier approach using zeros_like didn't work see numpy/numpy#21603

PR Checklist

Tests and Styling

  • [N/A] Has pytest style unit tests (and pytest passes).
  • Is Flake 8 compliant (install flake8-docstrings and run flake8 --docstring-convention=all).

Documentation

  • [N/A] New features are documented, with examples if plot related.
  • [N/A] New features have an entry in doc/users/next_whats_new/ (follow instructions in README.rst there).
  • [N/A] API changes documented in doc/api/next_api_changes/ (follow instructions in README.rst there).
  • [N/A] Documentation is sphinx and numpydoc compliant (the docs should build without error).

@jklymak
Copy link
Member

jklymak commented May 26, 2022

I think it needs a test, or we can break again. Unless you think the Quantities behaviour is a bug, in which case we should bounce the issue back to them.

@oscargus
Copy link
Member Author

Well, the "bug" is that np.zeros_like returns a copy of the same object and then fills it with zeros, rather than zeros that creates a new nparray object of the same size (unless the like-parameter is passed, when the type can be set to something else) and then fills it with zero.

This was not clear from the documentation. So any use of np.*_like should be avoided in Matplotlib for things where there may be Quantity involved (and possible other classes). There are 88 cases in the current code base, although maybe half are in tests and examples and so on.

Not sure how to modify our Quantity class to get this behavior though. I've spent some time trying to figure out why it is different, but it seems to boil down to how numpy decides to copy the array.

Here is a simple example to show the problem:

from matplotlib.tests.test_units import Quantity
import astropy.units as u
import numpy as np

mplq = Quantity([1, 2, 3], 'm')
q = [1, 2, 3]*u.m
print(repr(np.zeros_like(mplq, dtype=bool)))
print(repr(np.zeros_like(q, dtype=bool)))

So we would like our Quantity to still return a Quantity object after np.zeros_like for the test to make sense. Right now, testing with our Quantity will work even with the earlier code.

@oscargus
Copy link
Member Author

So any use of np.*_like should be avoided

Let me rephrase that: any use of np.*_like combined with setting dtype should be avoided.

There is only one use of that, apart from this, namely in image.py. This may (or may not) be the limiting factor of having units in images, but it is not necessarily and issue, only combined with e.g. np.less where values are set.

@jklymak
Copy link
Member

jklymak commented May 26, 2022

OK, this is fine. However, my complaint stands that this should be dealt with earlier than this in the processing of the inputs so that we don't keep having to squash issues like this.

@oscargus
Copy link
Member Author

oscargus commented May 26, 2022

I dug a bit more into it and astropy Quantity inherits from np.ndarray. Doing that to our Quantity mock leads to a crash in the code example above with what looks like an infinite recursion.

Redefining __deepcopy__ does not help either, although the hint in astropy may indicate it is involved:
https://github.com/astropy/astropy/blob/f0e2129aad460ea0db00730f10f98fe799aefb2f/astropy/units/quantity.py#L778-L781

I do not disagree that units should be handled differently. Just that I have very limited insight into it.

It would indeed make sense to get our mock Quantity to behave the same for this case to make the transition to another approach easier.

@pllim
Copy link

pllim commented May 26, 2022

I am going to cc @mhvk here too in case he has insights. Thanks!

@dopplershift
Copy link
Contributor

dopplershift commented May 26, 2022

I'll only add that the Quantity class as it exists right now has been crafted to make sure that it works like pint.Quantity (and thus tests that).

Instead of changing it, I'd advise making a different mock for astropy, and possibly make the tests parameterized around the mocks.

@oscargus
Copy link
Member Author

Thanks for the info @dopplershift !

I can confirm that pint.Quantity does not have this issue.

import numpy as np
import pint
q = pint.Quantity([1, 2, 3], 'm')
print(repr(np.zeros_like(q, dtype=bool)))

shows that the results of np.zeros_like is an np.array.

So something is slightly different for astropy.Quantity.

@dopplershift
Copy link
Contributor

Quantity/pint.Quantity is intentionally not an ndarray subclass due to the challenges of getting the subclassing right, and more importantly the complete inability to combine ndarray subclasses.

Pretty sure by design zeros_like (and friends) preserves the type for any subclass it is given.

@tacaswell tacaswell added the Release critical For bugs that make the library unusable (segfaults, incorrect plots, etc) and major regressions. label May 26, 2022
@oscargus
Copy link
Member Author

Pretty sure by design zeros_like (and friends) preserves the type for any subclass it is given.

I agree. Thought it was a convenience method for getting the same shape, not subclass, hence the introduced problem.

I'll try and make a second Quantity mock that subclasses ndarray and see if that has the problematic behavior.

@oscargus
Copy link
Member Author

Nah, it is possible to subclass (by getting rid of _getattr_), but to get it to properly fail while assigning values that does not have units, one would need to also introduce a mock unit class.

I, again, suggest that we should instead extend the weekly upcoming dependency tests
with more optional upstream and downstream libraries.

So I think that this is good to go as is.

@jklymak
Copy link
Member

jklymak commented May 27, 2022

Maybe pint is light weight enough to add as a test requirement for unit tests.

@oscargus
Copy link
Member Author

pint is 300 k, so should not be a problem. astropy (which causes the issue here) is 6-7 MB, so probably not worth it.

@jklymak
Copy link
Member

jklymak commented May 27, 2022

According to the above, pint and astropy are basically the same wrt the Quantity class?

@pllim
Copy link

pllim commented May 27, 2022

pint and astropy are basically the same

AFAIK they are unrelated? astropy.units came from pynbody.

@jklymak
Copy link
Member

jklymak commented May 27, 2022

Oh I'm sorry I misunderstood @dopplershift OTOH we should check the blame.

@oscargus
Copy link
Member Author

According to the above, pint and astropy are basically the same wrt the Quantity class?

If it is my statements you are referring to, I must have been unclear. However, I think I now have a much briefer summary:

  • pint.Quantity behaves as mpl.Quantity when it comes to np.*_like and works with current main.
  • astropy.Quantity does not behave as mpl.Quantity when it comes to np.*_like and does not work with current main. I think this is related to astropy.Quantity inheriting from np.ndarray, so the result of np.*_like is the subclass, not an np.ndarray.

@mhvk
Copy link
Contributor

mhvk commented May 27, 2022

Indeed, astropy's Quantity has an __array_function__ override that ensures np.zeros_like produces a Quantity; in principle, this should be standard behaviour also for non-subclasses that aim to mimic an array (e.g., np.zeros_like(dask_array) returns a dask_array). For creating a mask with the same shape, one would probably be better off doing np.zeros(q.shape, dtype=bool).

@dopplershift dopplershift merged commit 79f28ec into matplotlib:main May 27, 2022
@oscargus oscargus deleted the nanbarlabel2 branch May 27, 2022 19:12
@pllim
Copy link

pllim commented May 27, 2022

The job that was failing with matplotlib dev now passes over at astropy. Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Release critical For bugs that make the library unusable (segfaults, incorrect plots, etc) and major regressions.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants