Skip to content

Prefer plot() to scatter() in examples. #14174

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

anntzer
Copy link
Contributor

@anntzer anntzer commented May 8, 2019

... when markers don't vary in size or shape.

The docstring of scatter() suggests to do so (for performance), so let's
showcase it.

PR Summary

PR Checklist

  • Has Pytest style unit tests
  • Code is Flake 8 compliant
  • New features are documented, with examples if plot related
  • Documentation is sphinx and numpydoc compliant
  • Added an entry to doc/users/next_whats_new/ if major new feature (follow instructions in README.rst there)
  • Documented in doc/api/api_changes.rst if API changed in a backward-incompatible way

@@ -35,7 +35,7 @@
pivot='mid', units='inches')
qk = ax2.quiverkey(Q, 0.9, 0.9, 1, r'$1 \frac{m}{s}$', labelpos='E',
coordinates='figure')
ax2.scatter(X[::3, ::3], Y[::3, ::3], color='r', s=5)
ax2.plot(X[::3, ::3], Y[::3, ::3], 'r.', ms=3)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer to keep the explicit kwargs (ie. color='r', marker='.') as then it's obvious what is happening - it's definitely not intuitive what 'r.' means.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

switched to ".", c="r" (unfortunately marker="." is not sufficient as one also need to pass linestyle="none", which becomes quite a mouthful).

@timhoffm
Copy link
Member

timhoffm commented May 9, 2019

plot vs scatter has two aspects:

  • The technical one: Are all markers the same?
  • The semantic one: What do we assume the data does represent:
    plot implies some sort of ordering, which is reflected by the default of drawing a line through the sequence of points.
    • scatter assumes the points are uncorrelated and (x, y) are both coordinates. This is also reflected by the fact that scatter accepts 2D arrays, e.g. from meshgrid.

Therefore the note in scatter

The plot function will be faster for scatterplots where markers don't vary in size or color.

does not sufficiently cover the topic, e.g.

x = y = np.linspace(0, 1, 11)
X, Y = np.meshgrid(x, y)

_, (ax1, ax2) = plt.subplots(2, 1)
ax1.plot(X, Y)
ax2.scatter(X, Y)

grafik

I have no clear idea how to accomodate both aspects, but we should not just blindly follow the technical argument.

@anntzer
Copy link
Contributor Author

anntzer commented May 9, 2019

There's actually another reason to prefer plot() to scatter() when variable markers are not needed (but I didn't want to get into the details here): scatter autoscaling on log scale is poor (#3059, #5897, #10782, #11898), and one can easily hit that e.g. when changing the scales interactively. It is possible that #13642 will improve the situation, but it's not merged yet.
Also, I think in many cases the default marker size from plot() is just preferable to the one from scatter(); compare e.g. the scatter_hist.py example before (using scatter()):
old
and after the PR (using plot()):
new
(Obviously, one can change the marker size in either function call, so that's not a very strong argument.)

@timhoffm
Copy link
Member

timhoffm commented May 9, 2019

Marker size is a style choice, it really depends on your application. You may be right with log scaling.

However, in the end, this supports my point: There's more difference between plot and scatter than "plot is a faster substitute for scatter with constant color and marker size".

I'm open which one to use under which circumstances, but that needs clear documentation and instructions. Without that e.g. using plot instead of scatter in an example titled "Scatter plot with histograms" is just plainly confusing.

@anntzer
Copy link
Contributor Author

anntzer commented May 9, 2019

I can just revert the two "scatter hist" examples for now? None of the other examples mention the term "scatter", and I have no doubt there's a bunch of other places in the examples where we use plot(..., ".") to plot a point cloud.

@timhoffm
Copy link
Member

timhoffm commented May 9, 2019

The "Scatter hist" are the two most obvious scatter-like ones. The only clear case in which I really prefer plot is the "Violin plot customization". The other examples semantically plot scattered points. So I wouldn't change them.

I won't give a positive review until we've documented the aspects discussed above #14174 (comment). But if you at least leave out "Scatter hist", I'm not blocking this if others choose it's ok.

@anntzer
Copy link
Contributor Author

anntzer commented May 9, 2019

Reverted the scatter hist examples.

... when markers don't vary in size or shape.

The docstring of scatter() suggests to do so (for performance), so let's
showcase it.
@jklymak jklymak marked this pull request as draft April 23, 2021 16:34
@github-actions
Copy link

Since this Pull Request has not been updated in 60 days, it has been marked "inactive." This does not mean that it will be closed, though it may be moved to a "Draft" state. This helps maintainers prioritize their reviewing efforts. You can pick the PR back up anytime - please ping us if you need a review or guidance to move the PR forward! If you do not plan on continuing the work, please let us know so that we can either find someone to take the PR over, or close it.

@github-actions github-actions bot added the status: inactive Marked by the “Stale” Github Action label Jun 16, 2023
@anntzer
Copy link
Contributor Author

anntzer commented Jun 16, 2023

Let's not bother.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Documentation status: inactive Marked by the “Stale” Github Action status: needs rebase
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants