-
-
Notifications
You must be signed in to change notification settings - Fork 7.9k
Document and test what "array like" means to Matplotlib #22879
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
I would expect that chain to unify/modify/improve over time, see #22126 and #22645. Right now, I believe that there is a bit of special handling in different methods/functions so we should probably make sure that "all" methods/functions support data in the same format.
|
see matplotlib/matplotlib#22973, matplotlib/matplotlib#22879, and matplotlib/matplotlib#22560 It is not clear to me as to which is the standard interface for unit handling (for eg, hist still doesn't handle unit by default)
Here's a suggestion:
@runtime_checkable
class SupportsArray(Protocol):
"""Protocol for objects that support `__array__`."""
@abstractmethod
def __array__(self) -> NDArray[np.object_]:
"""Return the array of the tensor."""
... Many libraries (numpy/pandas/pyarrow/polars/pytorch/...) support the def func(x: SupportsArray | ArrayLike):
if isinstance(x, SupportsArray):
arr = x.__array__()
else:
try:
arr = numpy.array(x)
except TypeError as exc:
raise TypeError(f"Cannot convert {x} to array.") from exc So |
(I just wanted to open an issue on the conversion and propose a PR, as I came accross this issue) I would second the protocol method: just ran into the issue that TensorFlow and JAX arrays are not converted (but they have both the The reason for the protocol (but not excluding more): it's exactly what is wanted. Numpy defines that this is the way to convert it, so it's also an easy way for anyone to implement this. The strongest point to add this "feature" sooner than later: it's fully backwards compatible The possible issue with the "brute-force" conversion: could work, but maybe creates arrays of objects (if lists or other objects in other lists etc). If it's well documented behavior, sure! But it could break current use-cases. |
Note that Jax and torch arrays will be converted in the upcoming 3.9 release. See #25887 |
And as discussed in e.g. #25887, the reason that |
@oscargus What units are we talking about here? I am not aware of widely used python packages that add units (like e.g. Julia's Unitful.jl does), I guess there is |
Hm, that seems like a suboptimal solution? as @timhoffm also commented there, should we therefore rather strive for a general case or should I add TensorFlow separately? (-> I'll open a separate issue then)
This argument was never convincing to me: there is a general case that libraries adhere to, standards: the |
@randolf-scholz time is definitely one use case, but there is support for defining your own unit handling. Not sure if pint is supported directly though (been primarily involved as accidentally breaking unit support every now and then...). See https://matplotlib.org/stable/gallery/units/index.html for some idea how it works (not that exciting examples though, and some have disabled units I notice). There's been a recent push to figure out where unit actually is supported #26864, and, yes, primarily time is considered in the use cases, but there should be support in general. There's also a specific issue topic: topic: units and array ducktypes (There are people knowing the details to a much larger extent than I do.) |
As a quick solution, please add a special case for tensorflow as well similar to JAX/torch if needed.
If starting from scratch, this would indeed be a no-brainer. However Matplotlib's unit support predates all those standards and is deeply baked into the library. I don't have a full oversight (and not sure if anybody has), but changing the logic and implementation is likely a medium to major-scale effort, given that we need to maintain compatibility with existing code. |
I personally think we should work with downstream to make this work properly. The only tests that fail are the jpl_unit tests, and it should be relatively trivial to refactor their mock object to do the right thing. The real jpl could update their array class, and astrophysics and pint could as well. It's a little infuriating that we are ignoring community standards on our basic functionality to support our inconsistent and half-baked units implementation. |
We have always said we take "array like" input, however when we started saying that there was numarray/numeric, then numpy, and always Python lists. Later we started to have some special cases to support
pandas.Series
(both on straight conversion and to pick of the index). However, the the current rise of sturcutred and array-like libraries (xarray, akwaardarray, pandas recent changes, the fleet of GPU libraries, dask, the pandas-likes, ...) things have become a bit fuzzier and different parts of the library are not completely consistent with what they support.There has been work recently (most recently #22560) to consolidate all of the attempted coercion into one place in the code and articulate exactly what we mean by "array like".
We have standardized on the chain of:
We should: document this chain as what it means to "be array like" including including the expected shapes / squashing of these things (e.g. when do we expect (n,) vs (1, n) vs (n, 1) shaped arrays).
If we do this we may want to make
_unpack_to_numpy
and maybe our 1D and 2D coercion function public so that the datastructure libraries can test if they comply. We should, however, make clear in the documentation that the API garuntees on these functions will be a bit weaker in that we will reserve the right to make the more permissive in the future (that is input that we currently bounce may be handled in the future with no warning), but anything that we correctly get to numpy will continue to work (or go through the normal deprecation cycle).If we do this we should also provide a set of numpy arrays and/or lists and the expected return through the conversion as a fixture.
The text was updated successfully, but these errors were encountered: