Plotting three points using scatter results in wrong color - 1.4.3 #5377

Closed
liujimj opened this issue Nov 2, 2015 · 16 comments
@liujimj

liujimj commented Nov 2, 2015

I'm running this on Windows 10 / Anaconda Python 3.4.3 / Jupyter with IPython 4.

Here's some repro code:

import matplotlib.pyplot as plt
import numpy as np

fig = plt.figure(1)
ax = plt.axes([0., 0., 1., 1.])

points = np.random.rand(2,2) # plot 2 points, totally fine
ax.scatter(points[0], points[1], c=[1, 0, 0],  s=200)
points = np.random.rand(2,3) # plot 3 points, hopelessly broken
ax.scatter(points[0], points[1], c=[0, 1, 0], s=200)
points = np.random.rand(2,3) # plot 3 points with string color, totally ok
ax.scatter(points[0], points[1], c='g', s=200)
points = np.random.rand(2,4) # plot 4 points, totally fine
ax.scatter(points[0], points[1], c=[0, 0, 1], s=200)
plt.show()

Here's what I get:

[Image: download 1]

As you can see, plotting 2, 4, or any other number of points using ax.scatter is fine, but plotting 3 points with an RGB color array draws two points in black and one in white instead of the specified color.

Here's an example on a real dataset. As you can see, it also messes with the legend color. It's hard to diagnose the issue because it's completely non-obvious that the number of points plotted is what triggers the bug.
[Image: download 2]

I was pulling my hair out for a while until I realized that the problem only showed up for series that had exactly 3 members.

@efiring efiring added this to the Next bugfix release (1.5.1) milestone Nov 2, 2015
@efiring
Member

efiring commented Nov 2, 2015

Confirmed on master.
As a workaround, if you use c=[[0, 1, 0]] it will work.
Without the workaround, I get 2 black and 1 magenta rather than white; but in any case it is a bug.

@liujimj
Author

liujimj commented Nov 2, 2015

Thanks for taking a look.

I tried the workaround but I don't see any change. With nested color arrays:

[Image: download 3]

Here's my attempt:

import matplotlib.pyplot as plt
import numpy as np

fig = plt.figure(1)
ax = plt.axes([0., 0., 1., 1.])

points = np.random.rand(2,2) # plot 2 points, OK
ax.scatter(points[0], points[1], c=[[1, 0, 0]],  s=200)
points = np.random.rand(2,3) # plot 3 points, broken
ax.scatter(points[0], points[1], c=[[0, 1, 0]], s=200)
points = np.random.rand(2,3) # plot 3 points with string color, OK
ax.scatter(points[0], points[1], c='g', s=200)
points = np.random.rand(2,4) # plot 4 points, OK
ax.scatter(points[0], points[1], c=[[0, 0, 1]], s=200)
plt.show()

@u55
Contributor

u55 commented Nov 2, 2015

You have it backwards. In your examples, the scatter plots with three points are correct, the other plots are wrong. The documentation for scatter says:

c can be a single color format string, or a sequence of color specifications of length N, or a sequence of N numbers to be mapped to colors using the cmap and norm specified via kwargs (see below). Note that c should not be a single numeric RGB or RGBA sequence because that is indistinguishable from an array of values to be colormapped. c can be a 2-D array in which the rows are RGB or RGBA, however.

(emphasis mine)

The problem, in my opinion, is that scatter should raise a ValueError if c is a sequence whose length does not equal the number of points, instead of interpreting sequences of length 3 as RGB values and sequences of length 4 as RGBA values, as it currently does.
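
The check proposed above could look roughly like the following sketch. It is plain Python and not matplotlib's code: `check_flat_c` is a hypothetical helper, and it assumes that only flat numeric sequences need the length check, while strings and nested rows pass through untouched.

```python
from numbers import Real


def check_flat_c(c, n_points):
    """Hypothetical validator (not part of matplotlib).

    Rejects a flat numeric sequence whose length differs from the
    number of points instead of silently reading it as RGB or RGBA.
    """
    if isinstance(c, str):
        return c  # a color string such as 'g' applies to every point
    values = list(c)
    is_flat_numeric = all(isinstance(v, Real) for v in values)
    if is_flat_numeric and len(values) != n_points:
        raise ValueError(
            f"c has {len(values)} values but there are {n_points} points"
        )
    return values
```

Under this rule, c=[0, 1, 0] with three points is accepted as three values to colormap, while c=[1, 0, 0] with two points raises instead of being drawn in red.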

Note that the results depend on the current default colormap, as expected:

import matplotlib.pyplot as plt
import numpy as np

fig, ((ax1, ax2), (ax3, ax4)) = plt.subplots(2,2)

plt.rcParams['image.cmap'] = 'jet'
points = np.random.rand(2,3)
ax1.scatter(points[0], points[1], c=[0, 1, 0.3], s=200)

plt.rcParams['image.cmap'] = 'gray'
points = np.random.rand(2,3)
ax2.scatter(points[0], points[1], c=[0, 1, 0.3], s=200)

plt.rcParams['image.cmap'] = 'cool'
points = np.random.rand(2,3)
ax3.scatter(points[0], points[1], c=[0, 1, 0.3], s=200)

plt.rcParams['image.cmap'] = 'BrBG'
points = np.random.rand(2,3)
ax4.scatter(points[0], points[1], c=[0, 1, 0.3], s=200)

plt.show()

[Image: scatter_bug]

@liujimj
Author

liujimj commented Nov 3, 2015

TIL that the c argument expects a vector of colors.

Shouldn't nesting work though? From my last reply, the nesting is being completely ignored, even though the documentation says otherwise, e.g. ax.scatter(points[0], points[1], c=[[0, 1, 0]], s=200) is interpreted as if it were c=[0, 1, 0]

From the docs:

c can be a 2-D array in which the rows are RGB or RGBA, however, including the case of a single row to specify the same color for all points.

My workaround is to duplicate the rows, e.g.
c = [[0, 1, 0],] * n_points
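
The same row-duplication idea can be sketched with NumPy. `repeat_color` is a hypothetical helper name, not a matplotlib or NumPy function; it only illustrates that the result is a 2-D array with one RGB row per point, which cannot be mistaken for a flat sequence of values even when there are exactly three points.

```python
import numpy as np


def repeat_color(rgb, n_points):
    """Hypothetical helper: stack one RGB(A) color into an (n_points, k) array."""
    return np.tile(np.asarray(rgb, dtype=float), (n_points, 1))


# Three points and a three-element color give a (3, 3) array of rows.
colors = repeat_color([0, 1, 0], 3)
```

The resulting array would be passed as c=colors in place of the flat list.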

@u55
Contributor

u55 commented Nov 4, 2015

I agree that c=[[0, 1, 0]] should work as a single RGB color, but matplotlib 1.5.0 incorrectly treats it the same as c=[0, 1, 0]. This is definitely a bug.

Another possible workaround is to use matplotlib.colors.rgb2hex. Example usage:

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import rgb2hex
plt.rcParams['image.cmap'] = 'gray'

fig, ((ax1, ax2), (ax3, ax4)) = plt.subplots(2,2)
points = np.random.rand(2,3)

ax1.scatter(points[0], points[1], s=200, c=[[0, 1, 0]]) # Wrong!
ax2.scatter(points[0], points[1], s=200, c=(points.shape[1]*[[0, 1, 0]])) # Works Correctly
ax3.scatter(points[0], points[1], s=200, c=np.repeat([[0, 1, 0]], points.shape[1], axis=0)) # Works Correctly
ax4.scatter(points[0], points[1], s=200, c=rgb2hex([0, 1, 0])) # Works Correctly

plt.show()

@liujimj
Author

liujimj commented Nov 7, 2015

I confirm that both workarounds get the intended results. Thank you both for your help.

@mdboom mdboom modified the milestones: next bug fix release (2.0.1), Critical bugfix release (1.5.1) Nov 8, 2015
@has2k1
Contributor

has2k1 commented Nov 14, 2015

Same issue with RGBA tuple. This time the bug shows up when plotting 4 points.

import matplotlib.pyplot as plt

x = [[1],
     [2, 2],
     [3, 3, 3],
     [4, 4, 4, 4],
     [5, 5, 5, 5, 5],
     [6, 6, 6, 6],
     [7, 7, 7],
     [8, 8],
     [9]]
y = [list(range(len(_x))) for _x in x]
facecolor = (1.0, 1.0, 1.0, .5)

fig, ax = plt.subplots(1)
for i in range(len(x)):
    ax.scatter(x=x[i],
               y=y[i],
               facecolor=facecolor,
               s=15**2)

plt.show()

[Image: scatter-colortup-glitch]

@wavexx
Contributor

wavexx commented May 16, 2016

I got bitten by this behavior several times now. Seeing #6087 being closed woke me up.
Can't we restrict color specs to be always tuples and remove the ambiguity?

I understand that underneath we convert to ndarray, but the public interface seems pretty unambiguous to me.

@tacaswell
Member

The last example now works correctly on 2.x (more-or-less, the default edge width and color changed to 0 and 'face')

import matplotlib.pyplot as plt

x = [[1],
     [2, 2],
     [3, 3, 3],
     [4, 4, 4, 4],
     [5, 5, 5, 5, 5],
     [6, 6, 6, 6],
     [7, 7, 7],
     [8, 8],
     [9]]
y = [list(range(len(_x))) for _x in x]
facecolor = (1, 1, 1, .5)

fig, ax = plt.subplots(1)
for i in range(len(x)):
    ax.scatter(x=x[i],
               y=y[i],
               facecolor=facecolor,
               s=15**2,
               linewidths=1,
               edgecolors='k')

plt.show()

So the fundamental problem here is that sometimes c is meant as a 'color spec', that is, an RGB or RGBA tuple or a string (either a color name or a float string that maps to a shade of gray), and sometimes as an array of the same size as x and y that should be color mapped. When c has length 3 or 4 there is no way to determine the user's intention, so we go with color mapping. The kwarg color, by contrast, is always interpreted as a color.
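
As a rough illustration of this ambiguity, the sketch below is a toy model of the behavior reported in this thread. It is not matplotlib's implementation: `interpret_c` and its return labels are made up for the example, and it assumes a flat sequence whose length matches the number of points is always colormapped.

```python
def interpret_c(c, n_points):
    """Toy model of how a flat `c` ends up being read (hypothetical)."""
    if isinstance(c, str):
        return "color spec"
    values = list(c)
    if len(values) == n_points:
        # Same length as x and y: treated as data for the colormap,
        # even if it happens to look like an RGB or RGBA tuple.
        return "values to colormap"
    if len(values) in (3, 4):
        return "single RGB(A) color"
    return "invalid"


# The cases from the original report: 2, 3, and 4 points with an RGB list.
report = {n: interpret_c([0, 1, 0], n) for n in (2, 3, 4)}
```

In this model only the three-point case is colormapped, which matches the original report, and an RGBA tuple hits the same branch at four points.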

@efiring
Member

efiring commented May 17, 2016

On 2016/05/16 10:34 AM, wavexx wrote:

Can't we restrict color specs to be always tuples and remove the ambiguity?

No, because that would eliminate the use of a 2-D ndarray, (n,3) or (n,4), for a sequence of colorspecs. Or a row indexed from such an array.

With the most recent change, if you want a single color for everything, use the "color" kwarg.

The design error here, from early days (and probably from copying Matlab), is trying to use a single kwarg for too many different things.
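
A minimal sketch of that advice, assuming a recent matplotlib and a headless Agg backend (both are assumptions of this example, not something stated in the thread). It draws the same three points twice, once with color and once with c, and inspects what the returned collection stores.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
x = [0.1, 0.5, 0.9]
y = [0.2, 0.8, 0.4]

# `color` is read as a color spec, so the three-element tuple is one RGB color.
single = ax.scatter(x, y, color=(0, 1, 0), s=200)
single_rgba = single.get_facecolor().tolist()

# `c` with as many values as points is treated as data for the colormap.
mapped = ax.scatter(x, y, c=[0, 1, 0], s=200)
mapped_values = list(mapped.get_array())
```

Here single_rgba holds one RGBA row shared by all points, while mapped_values holds the three numbers that will be run through the colormap.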

@wavexx
Contributor

wavexx commented May 24, 2016

The answer from @efiring settles my confusion about allowed types. I didn't realize 2d ndarrays were allowed as well.

My only fear is the subtle difference in c/color semantics used in existing functions (where c/color are aliases).

@QuLogic QuLogic modified the milestones: 2.0.1 (next bug fix release), 2.0.2 (next bug fix release) May 3, 2017
@tacaswell
Member

Closing: given the extra degrees of freedom in c, we are breaking the ambiguity in a reasonable way, and if the user wants to set the color directly (rather than go through the color mapping framework via 'c') they can use color or facecolor/edgecolor.

@anntzer
Contributor

anntzer commented Oct 21, 2017

I think the subtle difference between c and color could at least be better documented (e.g. in the docstring entry for c?). Feel free to reclose if you disagree.

@anntzer anntzer reopened this Oct 21, 2017
@tacaswell tacaswell modified the milestones: v2.1.1, v2.2 Oct 21, 2017
@tacaswell
Member

Fair enough.

Re-milestoned for 2.2; I am trying to use the 2.1.1 milestone to manage the 2.1.1 release!

@anntzer
Contributor

anntzer commented Oct 21, 2017

that is totally fine with me :-)

@QuLogic
Member

QuLogic commented Jul 21, 2020

I think this is sufficiently documented now, from e.g., #17499.

@QuLogic QuLogic closed this as completed Jul 21, 2020
@QuLogic QuLogic modified the milestones: needs sorting, v3.3.0 Jul 21, 2020

9 participants