Skip to content

Stem speedup2 #12380

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 13 commits into from
Feb 15, 2019
Merged

Stem speedup2 #12380

merged 13 commits into from
Feb 15, 2019

Conversation

dstansby
Copy link
Member

@dstansby dstansby commented Oct 3, 2018

Instead of adding each line individually, add them as a line collection. This massively improves performance. Fixes #7969. This is a second go at #9565, with improved deprecation machinery.

@dstansby dstansby added this to the v3.1 milestone Oct 3, 2018
@dstansby dstansby mentioned this pull request Oct 3, 2018
list of `Line2D` objects for stem lines plotted using `ax.stem`. This gives a
very large performance boost to displaying and moving `ax.stem` plots.

This will become the only behaviour in Matplotlib 3.3. To use it now, the
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Version 3.3 is open to discussion, but I thought I'd put something in as a first go.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do not say that it will become the only behavior in v3.3. I suggest that in v3.3, the default value for the use_line_collection keyword argument should change from False to True. Then in v3.6 we can make it a no-op (possibly emitting a warning that it is a no-op, or some other predictable behavior).

Copy link
Member

@timhoffm timhoffm Oct 3, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A no-op doesn't make sense. If someone explicitly uses use_line_collection=False they do it because they rely on single Lines in their code. When we remove the functionality, we also have to error out if use_line_collection != True. At that time, we can deprecate the argument itself and remove it again later on.

But generally, I agree that we should announce to change of the default in 3.3.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, a no-op doesn't make sense. Certainly, we need two deprecations here to help with the transition. Yes, there will be people who won't do anything until v3.3, but at least at that point, they will update their code in a way that works for the previous two releases, and will continue to work for another release, giving a proper off-ramp.

@timhoffm
Copy link
Member

timhoffm commented Oct 3, 2018

I'm a bit afraid of reiterating the discussions in #9565. However, I think it's of value to describe the whole transition process:

  1. Introduce use_line_collection defaulting to False and warn on using use_line_collection=False
  2. Change the default to True
  3. Remove the single line code and only support use_line_collection=True
  4. Deprecate use_line_collection
  5. Remove use_line_collection.

Numbers 3. and 4. might be combined.

Here's the effect on the users:

a) Lazy default user: Does not use the lines for anything. Ignores the warning.
b) Eager default user: Does not use the lines for anything. Silences the warning in 1.
c) Lines user: Works with the resulting Lines/LineCollection.

Step lazy default user eager default user Lines user
1. warned warned, sets use_line_collection=True warned, changes code, set use_line_collection=True
2. -
3. -
4. - warned, deletes use_line_collection warned, deletes use_line_collection
5. -

It's a bit unfortunate that we bother the default users (which I suppose is the majority), but I don't see a better way of handling this so that the Lines user is appropriately warned and can react before we remove the lines functionality.

@dstansby
Copy link
Member Author

dstansby commented Oct 3, 2018

That's a nice clear explanation. With the message I've tried to give it a positive spin, so default users are motivated to use use_line_collection to increase performance.

@ImportanceOfBeingErnest
Copy link
Member

I don't think steps 4. and 5. are necessary. After step 3. one can just silenetly pop use_line_collection=True out of kwargs and error out on use_line_collection=<anything other than True>.

@dstansby dstansby added the Release critical For bugs that make the library unusable (segfaults, incorrect plots, etc) and major regressions. label Dec 29, 2018
ids=['w/ line collection', 'w/o line collection'])
@image_comparison(baseline_images=['stem'], extensions=['png'], style='mpl20',
remove_text=True)
def test_stem(use_line_collection):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps this may just as well use check_figures_equal to check that the result is the same regardless of use_line_collection, and avoid an additional baseline image?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think ax.stem had an image test before I added this one here, so I think it's maybe worth leaving in.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even then I'd rather check equivalency with e.g. manually creating the stem plot with individual calls to plot(), but it's just my preference.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is worth having a visual test here, because it helps a lot when trying to figure out problems (especially with the legend)

Do deprecation better

Fix linestyle
else:
cbook._warn_external(
'In Matplotlib 3.3 individual lines on a stem plot will be '
'added as a LineCollection instead of individual lines.\n'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we have explicit carriage returns in our messages. Let the terminal take care of that...

self.update_prop(lm, m, legend)
if using_linecoll:
# update_prop() usually takes two Line2D collections;
# override temporarily to copy properties from a LineCollection
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found this confusing, and had to look elsewhere in code for explanation. Suggest:

"change the function used by update_prop() from the default to one that handles LineCollection"

Copy link
Member

@jklymak jklymak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a couple of small suggestions, but this seems ready to merge

Remove carriage returns....
@jklymak
Copy link
Member

jklymak commented Feb 9, 2019

@dstansby, I pushed a couple of super minor commits.

I won't merge, but someone should when CI passes

@timhoffm
Copy link
Member

Flake8:
./lib/matplotlib/legend_handler.py:616:73: W291 trailing whitespace

@jklymak
Copy link
Member

jklymak commented Feb 10, 2019

Flake8:
./lib/matplotlib/legend_handler.py:616:73: W291 trailing whitespace

I actually think we should not check for this. Its super annoying and really doesn't hurt the code to have an extra whitespace at the end. I understand that in 1995 wasted bits was a bad thing, but now a few extra whitespaces is not going to affect anyone

@QuLogic
Copy link
Member

QuLogic commented Feb 11, 2019

It was never about wasted bits, except if you count unnecessary code churn (i.e., if someone's editor has trim-whitespace on). It can cause unnecessary word wrap when windows are not too wide (e.g., side-by-side diffs), and while some people like cursor-anywhere, I prefer the cursor to only reach the end of a line where it actually logically ends.

See also the rationale in PEP8 (which has nothing to do with byte counts):

Avoid trailing whitespace anywhere. Because it's usually invisible, it can be confusing: e.g. a backslash followed by a space and a newline does not count as a line continuation marker. Some editors don't preserve it and many projects (like CPython itself) have pre-commit hooks that reject it.

@timhoffm
Copy link
Member

Travis failure is unrelated. Would be fixed by a rebase to head. But saving the hassle as it passed before the final (minor) change to placate flake8.

@timhoffm timhoffm merged commit 7ca934f into matplotlib:master Feb 15, 2019
@dstansby
Copy link
Member Author

Hurrah, thanks for everyone who took the time to review this in its various incarnations!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Release critical For bugs that make the library unusable (segfaults, incorrect plots, etc) and major regressions.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants