Scatter plot with non-sequence 'c' color should give a better Error message. #10365

Closed
jason-neal opened this issue Feb 1, 2018 · 10 comments

@jason-neal
Contributor

Bug report

Bug summary

I was making a scatter plot with a single point. Passing a single number for the "c" keyword gave a slightly unhelpful TypeError from mcolors.to_rgba_array. In my real code a cmap is used to map the given number, but it is omitted here because the error is still reproduced without it.

Code for reproduction

import matplotlib.pyplot as plt
plt.scatter([0], [1], c=0.5)

Actual outcome

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-16-a8c544fd7de9> in <module>()
----> 1 plt.scatter([0], [1], c=0.5)

~/anaconda3/lib/python3.6/site-packages/matplotlib/pyplot.py in scatter(x, y, s, c, marker, cmap, norm, vmin, vmax, alpha, linewidths, verts, edgecolors, hold, data, **kwargs)
   3376                          vmin=vmin, vmax=vmax, alpha=alpha,
   3377                          linewidths=linewidths, verts=verts,
-> 3378                          edgecolors=edgecolors, data=data, **kwargs)
   3379     finally:
   3380         ax._hold = washold

~/anaconda3/lib/python3.6/site-packages/matplotlib/__init__.py in inner(ax, *args, **kwargs)
   1715                     warnings.warn(msg % (label_namer, func.__name__),
   1716                                   RuntimeWarning, stacklevel=2)
-> 1717             return func(ax, *args, **kwargs)
   1718         pre_doc = inner.__doc__
   1719         if pre_doc is None:

~/anaconda3/lib/python3.6/site-packages/matplotlib/axes/_axes.py in scatter(self, x, y, s, c, marker, cmap, norm, vmin, vmax, alpha, linewidths, verts, edgecolors, **kwargs)
   3981             try:
   3982                 # must be acceptable as PathCollection facecolors
-> 3983                 colors = mcolors.to_rgba_array(c)
   3984             except ValueError:
   3985                 # c not acceptable as PathCollection facecolor

~/anaconda3/lib/python3.6/site-packages/matplotlib/colors.py in to_rgba_array(c, alpha)
    229         pass
    230     # Convert one at a time.
--> 231     result = np.empty((len(c), 4), float)
    232     for i, cc in enumerate(c):
    233         result[i] = to_rgba(cc, alpha)

TypeError: object of type 'int' has no len()

Expected outcome

A useful message raised from an except TypeError: clause around the failing colors = mcolors.to_rgba_array(c) call on line 3983, much like the existing except ValueError: clause, would probably do it.

The message should indicate that c must be of the correct type (color, sequence, or sequence of color).

Note:
With x, y, and c all as single numbers the scatter plot call runs without any errors raised,
e.g. plt.scatter(0, 1, c=0.5)
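
The expected outcome described above could be sketched as a check that runs before plotting. This is only an illustration, not matplotlib's code: check_scatter_color is a hypothetical helper, and the wording of its message is an assumption modelled on the docstring's description of c.

```python
from numbers import Real


def check_scatter_color(c):
    """Hypothetical pre-check: reject a bare number passed as ``c``.

    Not part of matplotlib; it only illustrates the kind of message
    requested in this issue.
    """
    # A bare number has no len(), which is what triggers the
    # unhelpful TypeError shown in the traceback above.
    if isinstance(c, Real):
        raise TypeError(
            f"c={c!r} is a single number; c must be a color, "
            "a sequence of numbers, or a sequence of colors"
        )
    return c


# Sequences and color strings pass through unchanged.
check_scatter_color([0.5])
check_scatter_color("red")
```

Calling check_scatter_color(0.5) raises a TypeError that names the offending value and the accepted kinds of input, instead of a complaint about len().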

@afvincent
Contributor

@jason-neal thank you for the report.

As far as I understand, the difference between case A plt.scatter(0, 1, c=0.5) and case B plt.scatter([0], [1], c=0.5) is triggered in the following block of Axes.scatter (starting here):

        # After this block, c_array will be None unless
        # c is an array for mapping.  The potential ambiguity
        # with a sequence of 3 or 4 numbers is resolved in
        # favor of mapping, not rgb or rgba.
        if c_none or co is not None:
            c_array = None
        else:  # <- from the not shown instructions above, c_none is False
            try:
                c_array = np.asanyarray(c, dtype=float)
                if c_array.shape in xy_shape:  # <- True for case *A* => everything will be fine
                    c = np.ma.ravel(c_array)
                else:  # <- in case *B* c_array.shape == (), while xy_shape == ((1,), (1,))
                    # Wrong size; it must not be intended for mapping.
                    c_array = None
            except ValueError:
                # Failed to make a floating-point array; c must be color specs.
                c_array = None

        if c_array is None:  # <- True in case *B*
            try:
                # must be acceptable as PathCollection facecolors
                colors = mcolors.to_rgba_array(c)  # <- BOOM!
            except ValueError:
                # c not acceptable as PathCollection facecolor
                msg = ("c of shape {0} not acceptable as a color sequence "
                       "for x with size {1}, y with size {2}")
                raise ValueError(msg.format(c.shape, x.size, y.size))
        else:
            colors = None  # use cmap, norm after collection is created
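
The shape test in the block above can be tried in isolation. The sketch below assumes that xy_shape is the pair (np.shape(x), np.shape(y)), which matches the values quoted in the comments; c_matches_xy is a hypothetical name used only for this illustration.

```python
import numpy as np


def c_matches_xy(x, y, c):
    """Return True if the shape of c equals the shape of x or of y.

    Mirrors the `c_array.shape in xy_shape` test quoted above, under the
    assumption that xy_shape is (np.shape(x), np.shape(y)).
    """
    xy_shape = (np.shape(x), np.shape(y))
    c_array = np.asanyarray(c, dtype=float)
    return c_array.shape in xy_shape


print(c_matches_xy(0, 1, 0.5))      # case A: () is in ((), ()) -> True
print(c_matches_xy([0], [1], 0.5))  # case B: () is not in ((1,), (1,)) -> False
```

When the test returns False, c is treated as a color specification rather than as values to be mapped, which is how the scalar 0.5 ends up in mcolors.to_rgba_array.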

@afvincent afvincent self-assigned this Feb 2, 2018
@afvincent
Contributor

afvincent commented Feb 2, 2018

Funnily, something like plt.scatter([0], 1, c=0.5) does not raise an exception. @tacaswell (or any other core dev actually using scatter), is this the expected behavior? It seems a bit inconsistent to me to raise an error for plt.scatter([0], [1], c=0.5) but not for plt.scatter([0], 1, c=0.5) or plt.scatter(0, 1, c=0.5), but I would be glad to have a second opinion.

(Too bad I had a possible fix before noticing that behavior 😢)
For the record, my possible fix was something along the lines of @jason-neal 's suggestion:

---
 lib/matplotlib/axes/_axes.py | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/lib/matplotlib/axes/_axes.py b/lib/matplotlib/axes/_axes.py
index df2ccd55e..2137f6805 100644
--- a/lib/matplotlib/axes/_axes.py
+++ b/lib/matplotlib/axes/_axes.py
@@ -4228,11 +4228,19 @@ linewidth=2, markersize=12)
             try:
                 # must be acceptable as PathCollection facecolors
                 colors = mcolors.to_rgba_array(c)
-            except ValueError:
+            except (ValueError, TypeError) as e:
                 # c not acceptable as PathCollection facecolor
-                raise ValueError("c of shape {} not acceptable as a color "
-                                 "sequence for x with size {}, y with size {}"
-                                 .format(c.shape, x.size, y.size))
+                try:
+                    msg = ("c of shape {} not acceptable as a color "
+                           "sequence for x with size {}, y with size {}"
+                           .format(c.shape, x.size, y.size))
+                except AttributeError:
+                    # More verbose version that avoids `.shape`. Avoid `.size`
+                    # as well to handle cases as `plt.scatter(0, 1, c=[0.5])`.
+                    msg = ("c = {} is not of acceptable shape for color "
+                           "mapping with x with shape {}, y with shape {}"
+                           .format(c, *xy_shape))
+                raise type(e)(msg)
         else:
             colors = None  # use cmap, norm after collection is created
 
-- 
2.14.3

@afvincent
Contributor

Reading more carefully the docstring of plt.scatter:

x, y : array_like, shape (n, )
    Input data
[...]
c : color, sequence, or sequence of color, optional, default: 'b'
    `c` can be a single color format string, or a sequence of color
    specifications of length `N`, or a sequence of `N` numbers to be
    mapped to colors using the `cmap` and `norm` specified via kwargs
    (see below). Note that `c` should not be a single numeric RGB or
    RGBA sequence because that is indistinguishable from an array of
    values to be colormapped.  `c` can be a 2-D array in which the
    rows are RGB or RGBA, however, including the case of a single
    row to specify the same color for all points.

my understanding is that we do not support any of the following cases, even though some of them do not actually raise an error:

  • plt.scatter(0, 1, c=0.5) -> no exception
  • plt.scatter([0], [1], c=0.5)
  • plt.scatter(0, 1, c=[0.5])
  • plt.scatter([0], 1, c=0.5) -> no exception
  • plt.scatter([0], 1, c=[0.5]) -> no exception
  • etc.

In each of these cases at least one of the parameters is not “array-like”, contrary to what the docstring asks for.

PS: one can also note the inconsistent case between n and N in the docstring above \o/.

@jason-neal
Contributor Author

With the recent updates to scatter (e.g. #11383), most of the examples given above now produce error messages.

  • plt.scatter(0, 1, c=0.5) -> IndexError: tuple index out of range
  • plt.scatter([0], [1], c=0.5) -> IndexError: tuple index out of range
  • plt.scatter(0, 1, c=[0.5]) -> ValueError: 'c' argument has 1 elements, which is not acceptable for use with 'x' with size 1, 'y' with size 1.
  • plt.scatter([0], 1, c=0.5) -> IndexError: tuple index out of range

The last example still works.

  • plt.scatter([0], 1, c=[0.5]) -> no exception / Works

If this is supposed to work like this then this issue can probably be closed.

@mathematicalmichael

mathematicalmichael commented Feb 1, 2019

Leaving this comment here in the hope that it may be useful to others who come across this issue:

A solution that works for passing both arrays and scalars is:

plt.scatter(np.reshape(x,-1), np.reshape(y,-1), c=np.reshape(c,-1))

I was updating code that had x and y formed by indexing arrays, but if you passed a length-1 index, it would return a scalar, not a length-1 array. This would result in the error

"ValueError: 'c' argument has 1 elements, which is not acceptable for use with 'x' with size 1, 'y' with size 1."

and I found the np.reshape approach to be the cleanest way to upgrade from matplotlib==2.2.3 and handle a variety of use-cases.
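
A short sketch of what the np.reshape call above does to scalar and array inputs. The helper name as_1d and the sample data are made up for this illustration.

```python
import numpy as np


def as_1d(x, y, c):
    """Flatten x, y and c to 1-D arrays, whether they are scalars or arrays."""
    return np.reshape(x, -1), np.reshape(y, -1), np.reshape(c, -1)


data = np.array([3.0, 4.0, 5.0])

# Indexing a single element returns a scalar, not a length-1 array.
x1, y1, c1 = as_1d(data[0], data[1], 0.5)
print(x1.shape, y1.shape, c1.shape)  # (1,) (1,) (1,)

# Array inputs keep their length.
x3, y3, c3 = as_1d(data, data, [0.1, 0.2, 0.3])
print(x3.shape, y3.shape, c3.shape)  # (3,) (3,) (3,)
```

The flattened values can then be passed on as plt.scatter(x1, y1, c=c1), so that x, y and c always arrive as 1-D arrays of matching length.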

@ImportanceOfBeingErnest
Member

We definitely need more tests to make sure not to break the scatter API twice within two successive versions again.

@mathematicalmichael

mathematicalmichael commented Apr 3, 2019 via email

@efiring
Member

efiring commented Apr 14, 2019

Closing; mostly superseded by #12735.

@sivaji12344

@jason-neal for the code below
#Pass data to a matrix
x = dataFrame.as_matrix(columns = ['Size','# of Bedrooms'])
y = dataFrame.as_matrix(columns = ['Price'])

#Plot data, the color represents the
sc = scatter(x[:,0],x[:,1], c = y, s= 50)
cb = colorbar(sc)
cb.ax.set_ylabel('House Price $')
xlabel('Size m^2')
ylabel('# of Bedrooms')

#Make global variables that will be used often
global m,n
m = x.shape[1]
n = x.shape[1]+1

#make initial theta as an array of 1D
inTheta = zeros(n)

This gives:
ValueError: 'c' argument has 47 elements, which is not acceptable for use with 'x' with size 47, 'y' with size 47.
Please help me with this. Thank you.
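
One possible reading of the error above, assuming that selecting a single column with a list (columns = ['Price']) yields a 2-D array of shape (47, 1): c has the same number of elements as x and y but not the same shape. The sketch below uses stand-in NumPy arrays in place of the DataFrame, which is not shown in the comment. Flattening c is a common workaround, but the thread does not confirm that it resolves this particular case.

```python
import numpy as np

# Stand-in data (assumption): 47 rows, like the DataFrame in the comment.
x = np.zeros((47, 2))  # two feature columns
y = np.zeros((47, 1))  # one column selected with a list, so still 2-D

print(x[:, 0].shape)  # (47,)
print(y.shape)        # (47, 1)

# Flatten c so that its shape matches the 1-D x and y columns.
c_flat = y.ravel()
print(c_flat.shape)   # (47,)
```

With this change the call would read scatter(x[:,0], x[:,1], c = y.ravel(), s = 50).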

@jklymak
Member

jklymak commented Jan 22, 2020

It will be fixed in 3.2.


7 participants