plt.scatter, error with NaN values and edge color #19066

Closed
donok1 opened this issue Dec 4, 2020 · 4 comments · Fixed by #19539

donok1 commented Dec 4, 2020

Bug report

Bug summary
When drawing a scatter plot whose y values contain NaN, setting color or edgecolor raises an error for certain numbers of points.
The error was introduced in version 3.3.0 and might be linked to #17849.

Code for reproduction
Here is a basic example:

import matplotlib.pyplot as plt
import numpy as np
plt.scatter([1, 2, 3], [ 0, np.nan, 2],color=(1, 0, 0))

Actual outcome

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-77-cf954735eeaf> in <module>
      1 import matplotlib.pyplot as plt
      2 import numpy as np
----> 3 plt.scatter([1, 2, 3], [ 0, np.nan, 2],color=(1, 0, 0))

/usr/local/lib/python3.9/site-packages/matplotlib/pyplot.py in scatter(x, y, s, c, marker, cmap, norm, vmin, vmax, alpha, linewidths, verts, edgecolors, plotnonfinite, data, **kwargs)
   2888         verts=cbook.deprecation._deprecated_parameter,
   2889         edgecolors=None, *, plotnonfinite=False, data=None, **kwargs):
-> 2890     __ret = gca().scatter(
   2891         x, y, s=s, c=c, marker=marker, cmap=cmap, norm=norm,
   2892         vmin=vmin, vmax=vmax, alpha=alpha, linewidths=linewidths,

/usr/local/lib/python3.9/site-packages/matplotlib/__init__.py in inner(ax, data, *args, **kwargs)
   1445     def inner(ax, *args, data=None, **kwargs):
   1446         if data is None:
-> 1447             return func(ax, *map(sanitize_sequence, args), **kwargs)
   1448 
   1449         bound = new_sig.bind(ax, *args, **kwargs)

/usr/local/lib/python3.9/site-packages/matplotlib/cbook/deprecation.py in wrapper(*inner_args, **inner_kwargs)
    409                          else deprecation_addendum,
    410                 **kwargs)
--> 411         return func(*inner_args, **inner_kwargs)
    412 
    413     return wrapper

/usr/local/lib/python3.9/site-packages/matplotlib/axes/_axes.py in scatter(self, x, y, s, c, marker, cmap, norm, vmin, vmax, alpha, linewidths, verts, edgecolors, plotnonfinite, **kwargs)
   4486         offsets = np.ma.column_stack([x, y])
   4487 
-> 4488         collection = mcoll.PathCollection(
   4489                 (path,), scales,
   4490                 facecolors=colors,

/usr/local/lib/python3.9/site-packages/matplotlib/collections.py in __init__(self, paths, sizes, **kwargs)
    951         """
    952 
--> 953         super().__init__(**kwargs)
    954         self.set_paths(paths)
    955         self.set_sizes(sizes)

/usr/local/lib/python3.9/site-packages/matplotlib/cbook/deprecation.py in wrapper(*inner_args, **inner_kwargs)
    409                          else deprecation_addendum,
    410                 **kwargs)
--> 411         return func(*inner_args, **inner_kwargs)
    412 
    413     return wrapper

/usr/local/lib/python3.9/site-packages/matplotlib/collections.py in __init__(self, edgecolors, facecolors, linewidths, linestyles, capstyle, joinstyle, antialiaseds, offsets, transOffset, norm, cmap, pickradius, hatch, urls, offset_position, zorder, **kwargs)
    173         self._hatch_color = mcolors.to_rgba(mpl.rcParams['hatch.color'])
    174         self.set_facecolor(facecolors)
--> 175         self.set_edgecolor(edgecolors)
    176         self.set_linewidth(linewidths)
    177         self.set_linestyle(linestyles)

/usr/local/lib/python3.9/site-packages/matplotlib/collections.py in set_edgecolor(self, c)
    828         """
    829         self._original_edgecolor = c
--> 830         self._set_edgecolor(c)
    831 
    832     def set_alpha(self, alpha):

/usr/local/lib/python3.9/site-packages/matplotlib/collections.py in _set_edgecolor(self, c)
    812         except AttributeError:
    813             pass
--> 814         self._edgecolors = mcolors.to_rgba_array(c, self._alpha)
    815         if set_hatch_color and len(self._edgecolors):
    816             self._hatch_color = tuple(self._edgecolors[0])

/usr/local/lib/python3.9/site-packages/matplotlib/colors.py in to_rgba_array(c, alpha)
    339         return np.zeros((0, 4), float)
    340     else:
--> 341         return np.array([to_rgba(cc, alpha) for cc in c])
    342 
    343 

/usr/local/lib/python3.9/site-packages/matplotlib/colors.py in <listcomp>(.0)
    339         return np.zeros((0, 4), float)
    340     else:
--> 341         return np.array([to_rgba(cc, alpha) for cc in c])
    342 
    343 

/usr/local/lib/python3.9/site-packages/matplotlib/colors.py in to_rgba(c, alpha)
    187         rgba = None
    188     if rgba is None:  # Suppress exception chaining of cache lookup failure.
--> 189         rgba = _to_rgba_no_colorcycle(c, alpha)
    190         try:
    191             _colors_full_map.cache[c, alpha] = rgba

/usr/local/lib/python3.9/site-packages/matplotlib/colors.py in _to_rgba_no_colorcycle(c, alpha)
    261     # tuple color.
    262     if not np.iterable(c):
--> 263         raise ValueError(f"Invalid RGBA argument: {orig_c!r}")
    264     if len(c) not in [3, 4]:
    265         raise ValueError("RGBA sequence should have length 3 or 4")

ValueError: Invalid RGBA argument: 1

First analysis
It only affects the edgecolor option; facecolor works fine:

# These work fine
plt.scatter([1, 2, 3], [ 0, np.nan, 2])
plt.scatter([1, 2, 3], [ 0, np.nan, 2], facecolor=(1, 0, 0))

It also seems to happen only when there are 3 values to plot, 1 or 2 of them being NaN:

# These all work
plt.scatter([1, 2], [1, np.nan ],color=(1, 0, 0))
plt.scatter([1, 2, 3, 4], [1, 2, np.nan, 4 ],color=(1, 0, 0))
plt.scatter([1, 2, 3, 4, 5], [1, 2, np.nan, 4, 5 ],color=(1, 0, 0))
plt.scatter([1, 2, 3], [np.nan , np.nan , np.nan ],color=(1, 0, 0))
plt.scatter(np.arange(1000), np.append(np.random.rand(999), np.nan), color=(1, 0, 0))

# But these do not
plt.scatter([1, 2, 3], [1, np.nan, np.nan ],color=(1, 0, 0))
plt.scatter([1, 2, 3], [ np.nan, 2, np.nan ],color=(1, 0, 0))

The error was discovered through seaborn, mwaskom/seaborn#2373.

Matplotlib version

  • Operating system: Mac OS 10.15 / Windows 10
  • Matplotlib version: >= 3.3.0
  • Matplotlib backend (print(matplotlib.get_backend())): module://ipykernel.pylab.backend_inline
  • Python version: 3.8, 3.9.0
  • Jupyter version (if applicable): 2.2.9
  • Other libraries:
Member

QuLogic commented Dec 4, 2020

A bit unexpectedly, bisects to 1cbf8ff, cc @anntzer.

QuLogic added this to the v3.3.4 milestone on Dec 4, 2020
donok1 changed the title from "put.scatter, error with NaN values and edge color" to "plt.scatter, error with NaN values and edge color" on Dec 5, 2020
Contributor

anntzer commented Dec 5, 2020

From a quick look, this really arises from the scatter color-parsing machinery, which passes an RGB color with an incorrectly masked component to _to_rgba_no_colorcycle and expects the mask to just be ignored. I'd argue the scatter machinery should not do that (and _to_rgba_no_colorcycle should indeed reject such inputs), but in the meantime it looks like this can be fixed with

diff --git i/lib/matplotlib/colors.py w/lib/matplotlib/colors.py
index 9aad5b9b0..4c51c342e 100644
--- i/lib/matplotlib/colors.py
+++ w/lib/matplotlib/colors.py
@@ -275,7 +275,7 @@ def _to_rgba_no_colorcycle(c, alpha=None):
         raise ValueError(f"Invalid RGBA argument: {orig_c!r}")
     if len(c) not in [3, 4]:
         raise ValueError("RGBA sequence should have length 3 or 4")
-    if not all(isinstance(x, Number) for x in c):
+    if not all(isinstance(x, (Number, np.ma.core.MaskedConstant)) for x in c):
         # Checks that don't work: `map(float, ...)`, `np.array(..., float)` and
         # `np.array(...).astype(float)` would all convert "0.5" to 0.5.
         raise ValueError(f"Invalid RGBA argument: {orig_c!r}")

as well as silencing the warning that comes below from converting a masked value to a float (which likely throws away all the speedup...). Alternatively, #15834 can be cleanly reverted.
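
To illustrate why the isinstance check matters, here is a minimal sketch that applies both versions of the check to a hand-built masked RGB color. The variable names are only illustrative, and the masked array is constructed directly rather than taken from scatter's internals:

```python
from numbers import Number

import numpy as np

# An RGB color whose G and B components have been masked by mistake
# (built by hand here; in the bug it comes out of scatter's mask handling).
color = np.ma.array([1.0, 0.0, 0.0], mask=[False, True, True])

# Check as it stands before the diff: iterating a masked array yields
# np.ma.masked for masked entries, and that is not a Number.
original_check = [isinstance(x, Number) for x in color]

# Check as proposed in the stopgap diff: masked entries are accepted too.
patched_check = [
    isinstance(x, (Number, np.ma.core.MaskedConstant)) for x in color
]

print(original_check)  # [True, False, False]
print(patched_check)   # [True, True, True]
```

With the original check, all(...) is False for such a color, so the whole tuple is rejected as an invalid RGBA argument.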

Still, that's only a stopgap; I'd say #15834 really just exposed an issue that was actually introduced in #12422. Before #12422, scatter([1, 2, 3], [0, 1, 2], edgecolors=[1, np.nan, np.nan]) raised a ValueError (which seems correct). After it, the nans are treated as 0 (as if edgecolors=[1, 0, 0]), but the 2nd and 3rd points are masked out (as if the nans in the G and B channels were masking them).

Member

QuLogic commented Dec 22, 2020

I suppose we'd have to confirm with @efiring on #12422.

Member

efiring commented Feb 18, 2021

The problem is that scatter itself does not detect the case where edgecolor is a single RGB or RGBA color; in that case it should never be included in the arguments to _combine_masks.
I will try to fix it shortly.
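
For readers following along, here is a rough sketch of why only the three-point cases fail. The function combine_masks_sketch is a hypothetical, simplified stand-in written for this illustration. It is not matplotlib's actual _combine_masks, and it assumes that every argument with the same length as x is treated as per-point data:

```python
import numpy as np


def combine_masks_sketch(*args):
    """Hypothetical, simplified stand-in for a mask-combining helper.

    Arguments with the same length as the first one are treated as
    per-point data: a point that is non-finite in any of them is masked
    in all of them. Arguments of any other length pass through untouched.
    """
    n = len(args[0])
    per_point = [len(a) == n for a in args]
    arrays = [
        np.ma.masked_invalid(np.asarray(a, dtype=float)) if same else a
        for a, same in zip(args, per_point)
    ]
    # OR together the masks of all per-point arguments.
    combined = np.zeros(n, dtype=bool)
    for arr, same in zip(arrays, per_point):
        if same:
            combined |= np.ma.getmaskarray(arr)
    return [
        np.ma.array(arr.data, mask=combined) if same else arr
        for arr, same in zip(arrays, per_point)
    ]


color = (1, 0, 0)

# Three points: the RGB tuple also has length 3, so it is mistaken for
# per-point data and picks up the mask of the NaN in y.
_, _, masked_color = combine_masks_sketch([1, 2, 3], [0, np.nan, 2], color)
color_mask_3 = np.ma.getmaskarray(masked_color).tolist()
print(color_mask_3)  # [False, True, False]

# Four points: the lengths differ, so the tuple passes through unchanged.
_, _, untouched = combine_masks_sketch(
    [1, 2, 3, 4], [1, 2, np.nan, 4], color
)
print(untouched is color)  # True
```

Under that assumption, leaving a single RGB or RGBA color out of the call avoids the spurious mask, which is the direction described in the comment above.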

efiring added a commit to efiring/matplotlib that referenced this issue Feb 18, 2021
MihaiAnton pushed a commit to MihaiAnton/matplotlib that referenced this issue Mar 8, 2021