Skip to content

Speed up Path.iter_segments() #13039

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Dec 24, 2018
Merged

Conversation

lazka
Copy link
Contributor

@lazka lazka commented Dec 22, 2018

PR Summary

Using the Gtk3Cairo backend with wire3d_animation_sgskip.py:

before: 9.16 fps
iter: 9.95 fps
iter+types: 15.26 fps

The main speedup comes from iterating and keeping the common non-curve
case simple and from converting the code type constants to the same type as the codes array to make comparisons between them faster.

PR Checklist

  • Has Pytest style unit tests
  • Code is Flake 8 compliant
  • New features are documented, with examples if plot related
  • Documentation is sphinx and numpydoc compliant
  • Added an entry to doc/users/next_whats_new/ if major new feature (follow instructions in README.rst there)
  • Documented in doc/api/api_changes.rst if API changed in a backward-incompatible way

@lazka lazka force-pushed the speed-up-iter-segments branch from dcb5fbb to a798344 Compare December 23, 2018 11:11
lazka added a commit to lazka/matplotlib that referenced this pull request Dec 23, 2018
With the performance improvements in matplotlib#13040 and matplotlib#13039 the old slow path
is now faster than the previously fast one. And it also works with
pycairo.
@timhoffm
Copy link
Member

timhoffm commented Dec 23, 2018

Welcome to matplotlib development and thanks for your contribution.
You've found a good performance improvement!

Comparing different types costs extra time as the following tests show:

native (int, np.int64)

In [16]: %%timeit ref = 5
    ...: for i in np.arange(1_000_000):
    ...:     if i == ref:
    ...:         continue
    ...:     
887 ms ± 49.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

int reference, np.uint8 array:

In [17]: %%timeit ref = 5
    ...: for i in np.arange(1_000_000, dtype=np.uint8):
    ...:     if i == ref:
    ...:         continue
    ...:     
4.59 s ± 87.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

np.uint8 reference, np.uint8 array:

In [18]: %%timeit ref = np.uint8(5)
    ...: for i in np.arange(1_000_000, dtype=np.uint8):
    ...:     if i == ref:
    ...:         continue
    ...:     
443 ms ± 23.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

np.uint64 reference, np.uint64 array:

In [19]: %%timeit ref = np.uint64(5)
    ...: for i in np.arange(1_000_000, dtype=np.uint64):
    ...:     if i == ref:
    ...:         continue
    ...:     
475 ms ± 36.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

codes seem to be always uint8 (there's a class variable Path.code_type, which is unfortunately public, but most likely only intended for reading not for writing. Also the Path docstring says codes should be utf8). I recommend changing Path.STOP and Path.NUM_VERTICES_FOR_CODE to this type to avoid casting the individual elements at all, which should make the code even a bit more faster.

Using the Gtk3Cairo backend with wire3d_animation_sgskip.py:

Before: 9.16 fps
After: 9.95 fps

The main speedup is from iterating and keeping the common non-curve
case simple.
@lazka
Copy link
Contributor Author

lazka commented Dec 23, 2018

Thanks for the benchmarks, using uint8 seems much faster indeed.

Small question, do you mean adjusting Path.STOP, or only casting it in iter_segments(). The former would make many things using the constants faster everywhere I guess, but I'm not sure regarding compatibility.

edit: I've pushed the direct change for starters.

@lazka lazka force-pushed the speed-up-iter-segments branch from a798344 to ce7a6a6 Compare December 23, 2018 18:02
@timhoffm
Copy link
Member

timhoffm commented Dec 23, 2018

Yes, I meant the direct change.

I'm 99% sure about compatibility. uint8 can drop in wherever you expect an int and the predefined constants are small enough and don't change so that we don't have to worry about overflows. Even if somebody should extend this with own implementations and use int, everything should still work.

We should probably add an API change note to be on the safe side.

That said, I'd still want the opinion of other devs on this.

… the codes array

The matching types make comparisons between the constants and values of the codes array faster.

Using the Gtk3Cairo backend with wire3d_animation_sgskip.py:
Before: 9.95 fps
After: 15.26 fps

The main areas where this helps for the cairo case is the faster comparisons in
Path.iter_segments() and in _append_paths() of the cairo backend.
@lazka lazka force-pushed the speed-up-iter-segments branch from ce7a6a6 to 3b8689a Compare December 23, 2018 19:09
@lazka
Copy link
Contributor Author

lazka commented Dec 23, 2018

We should probably add an API change note to be on the safe side.

I've added something to next_api_changes

Copy link
Member

@timhoffm timhoffm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

Out of curiosity, have you benchmarked the new version?

@timhoffm timhoffm added this to the v3.1 milestone Dec 23, 2018
@lazka
Copy link
Contributor Author

lazka commented Dec 24, 2018

Out of curiosity, have you benchmarked the new version?

Only using wire3d_animation_sgskip + pycairo, see the PR summary and commits. (best out of 5 each time)

@tacaswell tacaswell merged commit 87085bd into matplotlib:master Dec 24, 2018
lazka added a commit to lazka/matplotlib that referenced this pull request Dec 25, 2018
With the performance improvements in matplotlib#13039 the old slow path is now faster
than the previously fast one. And it also works with pycairo.

Using the Gtk3Cairo backend with wire3d_animation_sgskip.py:

cairocffi + append_fast: 13.27 fps
cairo + append_slow: 15.07 fps
cairocffi + append_slow: 13.54 fps
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants