-
-
Notifications
You must be signed in to change notification settings - Fork 7.9k
Handle NaN in bar labels and error bars #22929
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
3a195da
to
7ada6b8
Compare
@@ -3504,7 +3504,9 @@ def apply_mask(arrays, mask): return [array[mask] for array in arrays] | |||
f"'{dep_axis}err' (shape: {np.shape(err)}) must be a " | |||
f"scalar or a 1D or (2, n) array-like whose shape matches " | |||
f"'{dep_axis}' (shape: {np.shape(dep)})") from None | |||
if np.any(err < -err): # like err<0, but also works for timedelta. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you check for the nan in the conditional?
(err < -err) & ~np.isnan(err)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isnan
doesn't work for dtype=object
https://stackoverflow.com/questions/36198118/np-isnan-on-arrays-of-dtype-object and casting to float will break timedelta
.
There may be a better way to write it, but that is the reason for the current approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe & err != err
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, there will be runtime warnings that will fail the tests.
err = np.array([np.nan, -0.1], dtype=object)
np.any(err < -err)
gives
/tmp/ipykernel_32424/1840248036.py:1: RuntimeWarning: invalid value encountered in less
np.any(err < -err)
independent of what is added (unless something like err != err
is added before, but then it will not test properly).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(It took me quite some time to get a solution that worked... Not that it may be a good indication, but at least I failed with quite a few alternatives...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But it is numpy? Although with dtype=object. Or am I missing something?
Yeah, but dtype object isn't guaranteed to do anything under mathematical operations (of particular relevance <
is not guaranteed to work). Its basically just a list.
The PR solves #22910 without introducing any new issues (as far as I can tell, all existing tests pass anyway). Although it may be that lots of other related things also should be fixed.
Fair, maybe this is a tangent of sorts, but I think we need to fix the underlying problems rather than add more bandaids on top.
It would really help to get some examples of things that doesn't work though.
I'm 99% sure that if you passed a unit-full pint object in as err
this would probably fail. Maybe it failed before as well?
I'm not blocking on this, but just pointing out that we still have substantial work to do on cleaning up the engineering here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that it is bandaid for sure.
Thanks! Wasn't aware of pint (although I've seen it mentioned). This is what happens on both main
and this when using errorbar
:
/local/data1/matplotlib/lib/matplotlib/axes/_axes.py:3329: UnitStrippedWarning: The unit of the quantity is stripped when downcasting to ndarray.
y = np.asarray(y, dtype=object)
/local/data1/matplotlib/lib/matplotlib/axes/_axes.py:3374: UnitStrippedWarning: The unit of the quantity is stripped when downcasting to ndarray.
return np.asarray(err, dtype=object)
and when using bar
:
/local/data1/miniconda3/lib/python3.8/site-packages/numpy/lib/stride_tricks.py:536: UnitStrippedWarning: The unit of the quantity is stripped when downcasting to ndarray.
args = [np.array(_m, copy=False, subok=subok) for _m in args]
Traceback (most recent call last):
File "/local/data1/miniconda3/lib/python3.8/site-packages/pint/quantity.py", line 901, in __float__
raise DimensionalityError(self._units, "dimensionless")
DimensionalityError: Cannot convert from 'meter' to 'dimensionless'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/tmp/ipykernel_13475/2206117657.py", line 1, in <cell line: 1>
plt.bar([1, 2], base, yerr=err)
File "/local/data1/matplotlib/lib/matplotlib/pyplot.py", line 2392, in bar
return gca().bar(
File "/local/data1/matplotlib/lib/matplotlib/__init__.py", line 1412, in inner
return func(ax, *map(sanitize_sequence, args), **kwargs)
File "/local/data1/matplotlib/lib/matplotlib/axes/_axes.py", line 2369, in bar
x, height, width, y, linewidth, hatch = np.broadcast_arrays(
File "<__array_function__ internals>", line 5, in broadcast_arrays
File "/local/data1/miniconda3/lib/python3.8/site-packages/numpy/lib/stride_tricks.py", line 536, in broadcast_arrays
args = [np.array(_m, copy=False, subok=subok) for _m in args]
File "/local/data1/miniconda3/lib/python3.8/site-packages/numpy/lib/stride_tricks.py", line 536, in <listcomp>
args = [np.array(_m, copy=False, subok=subok) for _m in args]
ValueError: setting an array element with a sequence.
(empty plot).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair, so the point is you are definitelynot making it worse here, but the problem still remains.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess that a quick bandaid for bar
and units is to use this approach there as well:
matplotlib/lib/matplotlib/axes/_axes.py
Lines 3357 to 3361 in f25c2d0
# Casting to object arrays preserves units. | |
if not isinstance(x, np.ndarray): | |
x = np.asarray(x, dtype=object) | |
if not isinstance(y, np.ndarray): | |
y = np.asarray(y, dtype=object) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, that is what I'm fundamentally objecting to - just making an object array doesn't make any sense!
maxx = max(xmin.max(), xmax.max()) | ||
miny = y.min() | ||
maxy = y.max() | ||
minx = min(np.nanmin(xmin), np.nanmin(xmax)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since these are already combining/using masks above, would it make sense to add a masked_invalid()
call earlier?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not really know how masks actually works (which is a bit ironically, considering my comment on your PR about masks I did just before seeing this...), although I realize that it may be related. Just noted that this didn't work correctly when trying out the other fixes.
I may give it a do though, either here or as a follow-up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 There is a cbook.safe_masked_invalid
if that would help at all. Masking the invalid locations just ignores those in the computations (well the details are more involved, but that is the easy way of thinking about it).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've played around with it and cannot get it to work. The reason seems to be that the array has dtype=object
(although it is float when created) and therefore I get runtime warnings on min
and max
.
Using masked_invalid
gives a masked array with a fill_value of ?
, while cbook.safe_masked_invalid
just returns an identical array.
If the array was float
it would probably work though, but I have no idea where it is becoming dtype=object
. Maybe related to datetime
support?
(Not really clear to me why nanmin
and nanmax
is worse than adding masked conversions.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think there is necessarily anything wrong with it if we are expecting nan's for some reason. To me, this just seemed like it was potentially hiding something else, which as you've noted here this only happens for dtype=object
. If you are passing in a float array, I really don't think we should be making it an object array 😟 Perhaps this is because these error bars are generally used for categoricals so something is converting to objects fast because of that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do this because we don't want to drop units.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems fine for the purpose of getting this to work.
Hello! A test in |
I think this is on us @pllim Looks like it is coming with this error: res = np.zeros_like(err, dtype=bool) # Default in case of nan
if np.any(np.less(err, -err, out=res, where=(err == err))): @oscargus, I wonder if the simple fix is to go to Although, I am wondering if this is going to hit other interesting cases too and we should back this commit out until we can solve the root of the issue with the units stuff. |
I think unit conversion has to happen earlier here. We are trying to not do it until draw time because these are Line2d objects, but that is too hard to thread through for compound types. |
Btw maybe let's open a new issue. We now have a Quantity class in testing that can be referenced for the error without recourse to astropy |
This seems to do it. One would think (at least I did) that Or rather, there seems to be an issue with either numpy or astropy here (but we can make sure it doesn't affect us, I'll provide a PR). Providing the
although if the argument is a I've opened a bug at numpy/numpy#21603 to avoid that more people fall in the same trap. (This reminds me that it would be nice to have a test suite with optional dependencies that runs every week or so. Although in this case we got the error back earlier than that.) @jklymak our |
Thank you for the quick response! Re: Quantity -- FWIW the full implementation is at https://github.com/astropy/astropy/blob/f0e2129aad460ea0db00730f10f98fe799aefb2f/astropy/units/quantity.py#L239 Re: Testing -- We test against a few dev versions of upstream in CI (for better or worse), so that is how we caught this quickly. We are also trying to figure out how to test against our own downstream packages as well. @astrofrog is putting something together at https://github.com/astropy/integration-testing (not deployed yet) if you are interested. |
Yes, thanks for testing our dev build. Does astropy test that changing axis units works? Or do you just need the first-pass conversion? |
Does this count as unit conversion? The test here plots one thing in one unit, and then a second thing in another compatible unit: |
@pllim Well that counts as first-pass conversion. i.e. both could be converted at the start of their method. But you never use |
I cannot find any |
I can see the confusion! |
PR Summary
Closes #22910
Also handles nan in vlines and hlines better. Earlier it "worked", but with a run-time warning. The bounding box was not scale correctly though.
Not sure if the way bounds are tested here is the best possible.
PR Checklist
Tests and Styling
pytest
passes).flake8-docstrings
and runflake8 --docstring-convention=all
).Documentation
doc/users/next_whats_new/
(follow instructions in README.rst there).doc/api/next_api_changes/
(follow instructions in README.rst there).