Matplotlib "eats" points when zeros present on logscaled scatter plot #2872

Closed
andrewcollette opened this issue Mar 6, 2014 · 19 comments

@andrewcollette

The following script demonstrates the issue with Matplotlib 1.3.1:

import pylab as plt

X = 5  # set X = 0 and matplotlib drops all later points

data1 = [1,2,3,4,5,6,7,8]
data2 = [1,2,3,4,X,6,7,8]

f = plt.figure(0)
plt.scatter(data1, data2)
plt.xscale('log')
plt.yscale('log')
plt.show()

The problem occurs when using a scatter plot with log-scaled axes: if zeros are present in the data, a substantial fraction of the points are discarded. Obviously zeros can't be displayed on a log plot, but I would have expected those points (and only those points) to be silently dropped, or at worst an exception to be raised.
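
As a rough illustration of the expected behavior (only the non-positive points are dropped), here is a minimal sketch in plain Python. The helper name `drop_nonpositive` is hypothetical and not part of Matplotlib; it shows a possible user-side filter applied before plotting, not what any backend actually does.

```python
def drop_nonpositive(xs, ys):
    """Keep only the (x, y) pairs that can be shown on log-log axes.

    Hypothetical helper for illustration; not a Matplotlib function.
    """
    kept = [(x, y) for x, y in zip(xs, ys) if x > 0 and y > 0]
    kept_x = [x for x, _ in kept]
    kept_y = [y for _, y in kept]
    return kept_x, kept_y


data1 = [1, 2, 3, 4, 5, 6, 7, 8]
data2 = [1, 2, 3, 4, 0, 6, 7, 8]  # the X = 0 case from the script above

clean1, clean2 = drop_nonpositive(data1, data2)
print(clean1)
print(clean2)
```

Passing `clean1` and `clean2` to `plt.scatter` instead of the raw lists would keep the zero out of the log transform altogether.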

@Tillsten
Contributor

Tillsten commented Mar 6, 2014

Not happening here, also on 1.3.1.

@andrewcollette
Author

If it helps, this is on OS X (10.9.2); matplotlib installed via pip.

@tacaswell
Member

What backend are you using?

I cannot reproduce this on master (Linux, PyQt).

@tacaswell tacaswell added the OSX label Mar 6, 2014
@andrewcollette
Author

The backend is "MacOSX", Python 2.7.6.

@tacaswell
Member

@mdehoon, @cimarronm, @efiring Can one of you take a look at this? I suspect this is a macosx backend bug. I do not have easy access to a mac for testing this.

@tacaswell tacaswell modified the milestones: v1.3.x, v1.3.x blocker Mar 6, 2014
@efiring
Member

efiring commented Mar 6, 2014

I can't reproduce this with macosx backend, master, python 2.7.3, numpy 1.7.0.

@andrewcollette
Author

Some more information:

NumPy 1.8.0

WXAgg and TKAgg work correctly (the troublesome point is simply skipped), as do "Agg" and using savefig() to output a PDF file.

@jenshnielsen
Member

I can reproduce it with the MacOS backend on 1.3.1. Works as expected using the QT backend.

@efiring
Member

efiring commented Mar 6, 2014

@jenshnielsen what is your numpy version?

@cimarronm
Contributor

I cannot reproduce with MacOS backend on master, numpy 1.8.0 and python 3.3.4

@jenshnielsen
Member

Numpy 1.8.1rc1 and python2.7

@efiring
Member

efiring commented Mar 6, 2014

OK, it is not numpy. I also tried 1.8.1rc1 and python 2.7, and with mpl master I can't reproduce it. I hope this means that whatever it is, it is already fixed in master.

@andrewcollette
Author

I did some further testing: on my system this bug IS present in the v1.3.x branch (8863ac), and IS NOT present in master (93de06).

@Tillsten
Contributor

Tillsten commented Mar 7, 2014

Maybe fixed with #1886?

@efiring
Member

efiring commented Mar 7, 2014

Given that this is fixed in master, that a release from master is in the works, and that it is not clear whether there will be another 1.3 release, I am closing this.

@efiring efiring closed this as completed Mar 7, 2014
@mdehoon
Contributor

mdehoon commented Mar 7, 2014

The bug is caused by the argument "offsets" in the call to gc.draw_path_collection. This argument contains NaNs if the log(0) data point is included. If we can assume that gc.draw_path_collection and other functions always receive valid arguments (i.e., not containing NaNs), then we can consider this bug fixed. However, if we cannot make that assumption, then gc.draw_path_collection contains a bug. It's fairly easy to fix gc.draw_path_collection so that NaNs are skipped (should they occur in the input arguments), but it may affect drawing speed, so I'd be happier if we can assume that there won't be NaNs in the input arguments to drawing functions.
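
To make the NaN in "offsets" concrete, here is a simplified sketch in plain Python. It assumes that a non-positive value is mapped to NaN when moved to log space; the function names are hypothetical, and the real offsets pass through Matplotlib's transform machinery rather than this toy code.

```python
import math


def log10_or_nan(value):
    # Assumption for this sketch: non-positive input becomes NaN in log space.
    return math.log10(value) if value > 0 else float("nan")


def log_offsets(xs, ys):
    """Toy stand-in for the offsets handed to the drawing call."""
    return [(log10_or_nan(x), log10_or_nan(y)) for x, y in zip(xs, ys)]


def nonfinite_indices(offsets):
    """Indices of offsets with a NaN or infinite coordinate."""
    return [
        i
        for i, (x, y) in enumerate(offsets)
        if not (math.isfinite(x) and math.isfinite(y))
    ]


data1 = [1, 2, 3, 4, 5, 6, 7, 8]
data2 = [1, 2, 3, 4, 0, 6, 7, 8]

offsets = log_offsets(data1, data2)
print(nonfinite_indices(offsets))  # position of the zero-valued point
```

Only one entry is non-finite here, so a drawing function that tolerates NaN would lose just that single point.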

@efiring
Member

efiring commented Mar 7, 2014

From Collection._prepare_points() I conclude that a NaN is the intended marker for a missing value in offsets. Now, the question is why this causes a problem in 1.3.1 but not in master.

@mdehoon
Contributor

mdehoon commented Mar 7, 2014

Two reasons:

  1. In master, draw_markers is used instead of draw_path_collection. The draw_markers function treats NaNs correctly, should they arise.
  2. In draw_markers, X=0 gives the dot a negative coordinate instead of a NaN, so it is simply not displayed.

I'll prepare a patch to fix draw_path_collection. It would be good if somebody could look into why a negative coordinate appears there.

mdehoon pushed a commit to mdehoon/matplotlib that referenced this issue Mar 7, 2014
… when using a logarithmic scale), draw_path_collection may get offsets containing NaN's. In that case, using CGContextTranslateCTM once with translation and once with -translation will not restore the original CTM. This bugfix adds a check for NaN/inf.
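
The commit message above explains why every point after the zero disappears: translating by NaN and then by -NaN leaves the transform stuck at NaN. The following is a toy sketch of that effect in plain Python. `FakeContext`, `draw_unguarded` and `draw_guarded` are hypothetical names used only for illustration; the actual backend code is C calling CGContextTranslateCTM, and this is not a copy of the patch.

```python
import math


class FakeContext:
    """Toy graphics context that tracks only a translation (hypothetical)."""

    def __init__(self):
        self.tx = 0.0
        self.ty = 0.0
        self.drawn = []

    def translate(self, dx, dy):
        self.tx += dx
        self.ty += dy

    def draw_marker(self):
        # In this toy model a marker is visible only at a finite origin.
        if math.isfinite(self.tx) and math.isfinite(self.ty):
            self.drawn.append((self.tx, self.ty))


def draw_unguarded(offsets):
    ctx = FakeContext()
    for dx, dy in offsets:
        ctx.translate(dx, dy)
        ctx.draw_marker()
        ctx.translate(-dx, -dy)  # meant to undo the move; NaN - NaN stays NaN
    return ctx.drawn


def draw_guarded(offsets):
    ctx = FakeContext()
    for dx, dy in offsets:
        if not (math.isfinite(dx) and math.isfinite(dy)):
            continue  # skip NaN/inf offsets before touching the transform
        ctx.translate(dx, dy)
        ctx.draw_marker()
        ctx.translate(-dx, -dy)
    return ctx.drawn


offsets = [(1.0, 1.0), (2.0, float("nan")), (3.0, 3.0)]
print(draw_unguarded(offsets))  # [(1.0, 1.0)]
print(draw_guarded(offsets))    # [(1.0, 1.0), (3.0, 3.0)]
```

In the unguarded loop the third point is lost even though its own offset is valid, which matches the symptom in the original report.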
@tacaswell
Member

Somewhat recently there was a change in how clipping for log plots works; that might be related.

tacaswell added a commit that referenced this issue Mar 11, 2014
Fix for issue #2872. Skip NaN's in draw_path_collection.