-
-
Notifications
You must be signed in to change notification settings - Fork 7.9k
_axes.py.scatter()
array index out of bound / calling from seaborn
#12641
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
The proposed fix looks fine to me. |
You may want to look at the selection code in #12572 which squashed a similar bug. If you open a PR, please do add atest so this doesn't happen again. Thanks! |
@mwaskom yes, new in This is matplotlib/lib/matplotlib/axes/_axes.py Lines 4184 to 4185 in 0881b9c
and in matplotlib/lib/matplotlib/axes/_axes.py Lines 4188 to 4193 in 921d4f4
@jklymak Will take a look there. Thx |
Even with the length check this is still problematic for certain iterables. I currently use the color option to color labeled data, where I label the data with a pandas Series indexed by ID numbers:
This plot allows for coloring the plotted points according to their label in v3.0.0. However, now when we check I have at lease one case of this that is breaking between 3.0.0 and 3.0.1. |
@montebhoover |
Yes that works for me - thanks! Does it work for both of our cases if the proposed fix is:
This allows me to still pass in a pandas Series:
but would that cover all the relevant cases? I believe a numpy ndarray is also not a Sequence, so that could cause problems if it contains strings. For my use I need to use the |
Pandas series are not iterable? What does no.isiterable give for them? |
In this case it’s because the index for c started at 10, so c[0] indexes into the series using 0 but 0 isn’t in the index. Therefore you see the key error. c.iloc[0] would work but that’s very pandas specific.
…Sent from my iPhone
On 27 Oct 2018, at 01:50, Jody Klymak ***@***.***> wrote:
Pandas series are not iterable? What does no.isiterable give for them?
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub, or mute the thread.
|
I think the real issue here is Anyway, to make this work for lists, numpy arrays and pandas series, we need to avoid indexing into
@jklymak @anntzer What do you guys think? It might be a bit of an overkill but would work for python lists, numpy array or pandas series, or any iterables. |
Why not just next(iter(x)), which seems just as fine as going through islice? Of course there's also the issue of whether it would be problematic to advance a non-rewindable generator (not sure). |
+1 for next(iter(x)) if it works for all types. I only tried islice for all 3.
Where would this helper function sit if we had one?
…Sent from my iPad
On 27 Oct 2018, at 20:59, Antony Lee ***@***.***> wrote:
Why not just next(iter(x)), which seems just as fine as going through islice? Of course there's also the issue of whether it would be problematic to advance a non-rewindable generator (not sure).
Note that there are other places where we use x[0] for similar pruposes so they should get the same fix (hopefully factored out as a helper).
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub, or mute the thread.
|
private (prefixed by underscore) in cbook. |
Ok, going to be traveling tomorrow but here's some stuff to get us going. I ended up writing a more general helper function to test element type in an Iterable. This function below is what I plan to add in The function basically tests if all elements are of the expected type by default, but can test only the first N items as well if specified. I ended up going with One thing on this - if the iterable being tested is empty, i.e. @jklymak Haven't looked into a test case yet but will give that some thoughts. Let me know what you guys think.
|
Sorry for my typo above but does np.is_iterable help? |
Less about testing for iterable here, it's caused by |
Would stick to just testing the first one: it seems needlessly costly to iterate a potentially very long collection in pure python code, and it's not as if we had any sensible behavior for mixed types anyways (I think). |
I agree, checking first 1 is what I plan to put into the PR fix.
How about the default behaviour for empty but not None list? Currently behaviour is to return True.
…Sent from my iPhone
On 28 Oct 2018, at 11:22, Antony Lee ***@***.***> wrote:
The function basically tests if all elements are of the expected type by default, but can test only the first N items as well if specified.
Would stick to just testing the first one: it seems needlessly costly to iterate a potentially very long collection in pure python code, and it's not as if we had any sensible behavior for mixed types anyways (I think).
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub, or mute the thread.
|
While the general version of this sounds good, I am skeptical of adding a new helper function in a patch release. Wouldn't the simplest solution be to use https://github.com/matplotlib/matplotlib/blob/master/lib/matplotlib/cbook/__init__.py#L1606 instead of |
I wrote test cases for the general method proposed as well. But happy to do either. I noticed that there isn't any dependency on I know that |
We have a |
Interesting. Where can I find an example?
Thx
…Sent from my iPhone
On 29 Oct 2018, at 03:12, Tim Hoffmann ***@***.***> wrote:
We have a pd fixture to enable pandas-related tests but not introduce it as mandatory dependency.
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub, or mute the thread.
|
E.g. matplotlib/lib/matplotlib/tests/test_axes.py Line 5314 in 6af7450
|
see #12664. IMHO, this is an easy way to hit the same bug. import matplotlib.pyplot as plt
import numpy as np
n_points = 4
xs = ys = zs = np.full(n_points, np.nan)
colors = list(np.full(2, 'k'))
plt.scatter(xs, ys, zs, c=colors)
plt.show() |
@massich I don't quite understand that example. That errors on colors not being the same length as xs, etc. Which seems correct to me. |
This is the right snipped import matplotlib.pyplot as plt
import numpy as np
from mpl_toolkits.mplot3d import Axes3D # noqa: F401 unused import
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
n_points = 4
xs = ys = zs = np.full(n_points, np.nan)
colors = list(np.full(n_points, 'k'))
ax.scatter(xs, ys, zs, c=colors)
plt.show() |
Pushed a fix using @tacaswell 's suggestion to use @tacaswell The test case for @montebhoover @massich this should fix both of your issues. Let me know what you guys think. |
Closed by #12673, afaict. |
Bug report
Bug summary
Seeing this with
matplotlib 3.0.1
andseaborn 0.9
. cc @mwaskommatplotlib.axes._axes.scatter() tests parameter
c
hitsarray index out of bound
whenc
is an empty array.I encountered this while using
seaborn.swarmplot()
withhue=
parameter. Essentially this plots each group (hue
, in addition to x-axis category) and some times a group can be empty for a major x-axis category.Suggest potential fix to test length of
c
as well before accessingc[0]
.Code for reproduction
Actual outcome
This is the error I see:
Happy to create a PR if we have happy with the proposed length test.
The text was updated successfully, but these errors were encountered: