Add a fast path for NumPy arrays to Collection.set_verts #16689

Merged: 1 commit merged into matplotlib:master from the collection_set_paths_ndarray branch on Mar 16, 2020

Conversation

@apaszke (Contributor) commented Mar 5, 2020

PR Summary

This reduces the run time for larger 3D plots by over a second on the toy example linked in #16688. In total, #16675, #16688, and this PR lower the run time for that example from ~16s to ~3.3s (a 4.8x speedup).

PR Checklist

  • Has Pytest style unit tests
  • Code is Flake 8 compliant
  • New features are documented, with examples if plot related
  • Documentation is sphinx and numpydoc compliant
  • Added an entry to doc/users/next_whats_new/ if major new feature (follow instructions in README.rst there)
  • Documented in doc/api/api_changes.rst if API changed in a backward-incompatible way
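
To make the change concrete for readers of this thread, here is a minimal sketch of the kind of fast path being described. It is an illustration rather than the actual diff of this PR, and it assumes the common case where `verts` arrives as a single (N, M, 2) NumPy array of closed polygons; the function name `build_paths` is hypothetical.

```python
import numpy as np
from matplotlib.path import Path


def build_paths(verts, closed=True):
    # Hypothetical helper illustrating a NumPy fast path for set_verts-like code.
    if isinstance(verts, np.ndarray) and verts.ndim == 3 and closed:
        # Fast path: every polygon has the same number of vertices, so one
        # shared `codes` array can be built once instead of once per Path.
        codes = np.full(verts.shape[1] + 1, Path.LINETO, dtype=Path.code_type)
        codes[0] = Path.MOVETO
        codes[-1] = Path.CLOSEPOLY
        # Repeat the first vertex of each polygon so CLOSEPOLY has a point.
        verts = np.concatenate([verts, verts[:, :1]], axis=1)
        return [Path(xy, codes) for xy in verts]
    # Generic path: one Path per polygon, letting Path build its own codes.
    return [Path(xy, closed=closed) for xy in verts]
```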

@anntzer (Contributor) commented Mar 5, 2020

This will conflict with https://github.com/matplotlib/matplotlib/pull/16617/files#diff-506c6bd01694bddbd8999f2c6e920705. Feel free to pick up the relevant patch in my PR, or I can rebase mine on top of yours, whatever works for you.

@timhoffm (Member) left a comment

Could you add a test that both code paths (ndarray and non-ndarray) result in the same self._paths?
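
A test along the lines requested here might look like the sketch below. It is illustrative only, with a hypothetical test name; the test actually added to this PR may differ in detail.

```python
import numpy as np
from matplotlib.collections import PolyCollection


def test_ndarray_and_list_verts_give_same_paths():
    # Ten quadrilaterals, passed once as a single ndarray (fast path) and
    # once as a list of arrays (generic path); the resulting paths must match.
    verts = np.arange(80, dtype=float).reshape(10, 4, 2)
    col_array = PolyCollection(verts)
    col_list = PolyCollection(list(verts))
    for path_a, path_l in zip(col_array.get_paths(), col_list.get_paths()):
        assert np.array_equal(path_a.vertices, path_l.vertices)
        assert np.array_equal(path_a.codes, path_l.codes)
```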

@timhoffm added this to the v3.3.0 milestone on Mar 6, 2020
@apaszke (Contributor, Author) commented Mar 6, 2020

@anntzer mine is a very local change, so I'm ok with waiting until yours lands.

@anntzer (Contributor) commented Mar 6, 2020

Either way is fine, but I think you should check whether manually constructing the codes array is significantly faster than using codes=True -- and, if so, leave a comment explicitly asking that the codepath not be "simplified" to codes=True later.

@apaszke (Contributor, Author) commented Mar 6, 2020

@anntzer I assume you meant closed=True, right? In that case, yes, creating the codes once is much faster (it saves about 400ms in my benchmark, likely due to lower GC pressure, fewer allocations, etc.).

@timhoffm Added the test.
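
For reference, the comparison being discussed can be timed with a micro-benchmark along these lines. This is a sketch, not apaszke's actual benchmark; the array shape is a placeholder that roughly matches a 400x400 surface plot.

```python
import timeit

import numpy as np
from matplotlib.path import Path

# ~159k quadrilaterals (roughly a 400x400 surface plot), with the first
# vertex repeated so each polygon is explicitly closed.
verts = np.random.rand(399 * 399, 4, 2)
closed_verts = np.concatenate([verts, verts[:, :1]], axis=1)


def shared_codes():
    # Build the codes array once and reuse it for every polygon.
    codes = np.full(closed_verts.shape[1], Path.LINETO, dtype=Path.code_type)
    codes[0] = Path.MOVETO
    codes[-1] = Path.CLOSEPOLY
    return [Path(xy, codes) for xy in closed_verts]


def per_path_closed_true():
    # Let each Path construct its own codes via closed=True.
    return [Path(xy, closed=True) for xy in closed_verts]


print("shared codes array  :", timeit.timeit(shared_codes, number=1), "s")
print("closed=True per path:", timeit.timeit(per_path_closed_true, number=1), "s")
```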

@apaszke (Contributor, Author) commented Mar 6, 2020

@anntzer I've changed the slow path to use closed=True now, but I'm wondering whether it wouldn't still be faster to use a dict to memoize the codes for each length. I don't care that much about the unoptimized path in my solution, but I can benchmark it if you want.

@anntzer (Contributor) commented Mar 6, 2020

I guess that could actually be a useful general optimization in path.py, e.g. preallocating the codes for closed paths of up to, say, 16 vertices (I picked 16 because that's the same caching threshold used in unit_regular_polygon/unit_regular_star).

(In which case perhaps implement that optimization first and see whether you can exploit it here.)
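
A hedged sketch of the kind of per-length cache being floated here; the helper name `_closed_polygon_codes` is hypothetical, not an existing matplotlib API.

```python
from functools import lru_cache

import numpy as np
from matplotlib.path import Path


@lru_cache(maxsize=16)
def _closed_polygon_codes(num_vertices):
    # One codes array per polygon size (num_vertices counts the repeated
    # closing vertex). Callers must treat the result as read-only, since the
    # same array is shared by every Path of that size.
    codes = np.full(num_vertices, Path.LINETO, dtype=Path.code_type)
    codes[0] = Path.MOVETO
    codes[-1] = Path.CLOSEPOLY
    return codes
```

A plain dict keyed by length, as suggested earlier in the thread, would behave the same; lru_cache just bounds how many sizes stay cached (16 here only loosely echoes the unit_regular_polygon/unit_regular_star threshold).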

@apaszke (Contributor, Author) commented Mar 6, 2020

@anntzer Good point. I'll experiment with that once this lands.

@apaszke (Contributor, Author) commented Mar 16, 2020

It would be great if this could get another stamp and land. I've done some more work, and it turns out that path creation can be optimized even further! After this PR it takes ~1.4s to create all the paths, but a ~20 LOC change brings this down to ~0.19s. A hackier version gets it even lower (to ~0.1s), but that is a bit too much for my taste, since it requires inlining some function bodies (calls are expensive in Python).

With the PRs I've submitted so far (including this one), rendering my example (a 400x400 surface plot on a 2012 MBP) takes ~2s on average. With the changes I have stacked on top of this, it drops to ~0.7s, which breaks down as:

  • ~0.4s in Poly3DCollection.do_3d_projection
    • Including ~0.2s in Collection.set_paths
  • ~0.3s in Collection.draw (mostly spent in the C code of the Agg backend in this case)

I've also done a bit more digging, and I think it would be feasible to refactor Collection so that it always stores the _paths data in a NumPy array (possibly masked). That should eliminate a lot of overhead, both in the do_3d_projection part and in the backend.

@apaszke force-pushed the collection_set_paths_ndarray branch from cfcc6a0 to 9fa83c3 on March 16, 2020, 10:57
@anntzer (Contributor) commented Mar 16, 2020

Anyone can merge after CI passes (ignoring CircleCI, which shouldn't be affected by this and is hopefully getting fixed in #16784).

@apaszke (Contributor, Author) commented Mar 16, 2020

I'm not sure what's causing the matplotlib.matplotlib Windows failures. The OS X and Linux jobs are green, and I can't reproduce the failures locally either. Is that a known issue?

@anntzer (Contributor) commented Mar 16, 2020

that just looks like flakiness. do you want to squash the last commit? either way is fine.

Commit message: This reduces the run time for larger 3D plots by over a second on some toy examples.
@apaszke force-pushed the collection_set_paths_ndarray branch from 9fa83c3 to 95977d6 on March 16, 2020, 14:27
@apaszke (Contributor, Author) commented Mar 16, 2020

All commits are now squashed.

@anntzer merged commit 314dae4 into matplotlib:master on Mar 16, 2020
@anntzer (Contributor) commented Mar 16, 2020

thanks :)

@apaszke deleted the collection_set_paths_ndarray branch on March 16, 2020, 16:44
@apaszke (Contributor, Author) commented Mar 16, 2020

@anntzer Thanks for all the reviews and help with getting this code in!

@anntzer (Contributor) commented Mar 16, 2020

You're welcome; your PRs are great 😁
