-
-
Notifications
You must be signed in to change notification settings - Fork 7.9k
Stem performance boost #9565
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Stem performance boost #9565
Conversation
Looks like some extra files got included in this commit? |
lib/matplotlib/container.py
Outdated
markerline_stemlines_baseline : tuple | ||
Tuple of ``(markerline, stemlines, baseline)``. | ||
``markerline`` contains the `LineCollection` of the markers, | ||
``stemlines`` is a list of the `Line2D` of each line, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this doc entry be updated to say LineCollection? Does it matter? Should we coerce to one type or another?
BTW, not saying it will be fixed - it looks like legend_demo wants a list of lines.... |
lib/matplotlib/legend_handler.py
Outdated
l = Line2D([thisx, thisx], [bottom, thisy]) | ||
leg_stemlines.append(l) | ||
|
||
for lm, m in zip(leg_stemlines, stemlines): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I can tell these two lines aren't needed; not 100% sure though...
c68e2c8
to
b7cdd7f
Compare
I've added a legend to the image test just so |
b7cdd7f
to
6002b62
Compare
------------------------------------------- | ||
|
||
`StemContainer` objects now store a `LineCollection` object instead of a list | ||
of `Line2D` objects for stem lines plotted using `ax.stem`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you expand on the consequences of this a bit? I would have to go look up the code to see how to do a migration.
Adding a note about the speed up would help mollify the angry user who's code we just broke ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've had a go, but not too sure on what to put - if anyone has a better suggestion feel free to push to my branch.
x = np.linspace(0.1, 2 * np.pi, 100)
fig, ax = plt.subplots()
ax.stem(x, np.cos(x), linefmt='C0-', markerfmt='k+', basefmt='C1-.',
label='Stem')
ax.stem(x, np.sin(x), linefmt='k-', markerfmt='r+', basefmt='C1-.',
label='Sine')
ax.legend() Has a slight blemish in the legend in that the |
4d7ff73
to
86de3d4
Compare
@@ -619,6 +624,13 @@ def create_artists(self, legend, orig_handle, | |||
|
|||
return artists | |||
|
|||
def _copy_collection_props(self, legend_handle, orig_handle): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I managed to fix the legend properties by adding this function, which kind of works.
I've made a slightly hacky fix for the legend. What I really want to do is find a way of extracting a property cycle from the |
large performance boost to displaying and moving `ax.stem` plots. | ||
|
||
Line segments can be extracted from the `LineCollection` using | ||
`LineCollection.get_segements()`. See the `LineCollection` documentation for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo segments
Can you clarify what you need, and why you're not satisfied with your current implementation of the legend handler? |
Hmm, I'm not sure anymore, maybe this way is fine. |
So the legend is slightly off in that they elements are drawn in the wrong order; can that be fixed? This also needs a rebase. |
#9565 (comment) still stands - the stem and marker are drawn in the wrong zorder in the legend. Otherwise this looks fine to me. |
I don't recall the exact conversation, but I think it was considered worth breaking the backwards compatibility. I guess what would help in the API change note is a quick summary of before and after of how to do something like change the line width... I can't imagine its too painful - i.e. old |
I think we should consider making the transition more smoothly. Maybe an easy way to provide a two cycle warning is to duplicate the code into a new function, provide the fast version under a new name, and do a two cycle deprecation of this one? I don't know where the name |
Just to mention, this is exactly the way you end up with things like |
eh, not exactly. pcolor and pcolormesh are two different things and one was
not intended to replace the other. pcolormesh is faster for a subset of
things, so we couldn't get rid of something that works all the time. Now,
pcolorfast() was a bad design choice because it was trying to figure out
which one to use in what situation.
What we have here is something that will always be faster and more
efficient and we want to deprecate the old approach. We could call this
stem2(), or stemfast() perhaps. And then go a few cycles and get rid of
stem(). A few cycles later, we can take back stem().
…On Fri, Jul 13, 2018 at 7:28 PM, Elan Ernest ***@***.***> wrote:
... to duplicate the code into a new function, provide the fast version
under a new name ...
Just to mention, this is exactly the way you end up with things like
pcolor, pcolormesh, pcolorfast and in the end noone will know which one
is the one to actually use.
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#9565 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AARy-Ip4UMfe_DH0fd5byAbvtnQoIpOuks5uGS0KgaJpZM4QE_Di>
.
|
The solution I proposed is also the first thing that came to my mind. There may be a better way to ensure backward compatibility for a few release cycles. If LineCollections can be easily (or not) converted on the fly to list of Line2D, then we may be able to exploit that: a solution would then be to use another argument name than (To be clear, I am not suggesting to use the name |
I don't have any bandwidth to work out how to do the deprecation properly, so anyone feel free to push to my branch (please don't rewrite what's already there though) |
Punting to 3.1 to give time to sort out this discussion / put in a more gracefully change strategy. |
Had a little think about this; as I've said in the API note,
Which is me trying to say that this is how you get the old return type from the new return type, though re-reading it it's not that clear so I will re-word it. I think @ImportanceOfBeingErnest has the best suggestion, add a kwarg that changes the return type, set as default to the current return type, and warn that the return type will change in 2 release cycles. |
The backward compatible kwarg option would only work if by default, stem was not using the optimized version of the code. I'm not so concerned about providing the user with an option to switch back to the old API, but about giving proper warning to the user on the API breakage. |
This PR is still lingering - I don't really know how we give warning that we will deprecate the returned type of a method. Don't we just change it and in the API note give the way to get the old API back? |
@jklymak I don't think this solution has any advantages. If you are going to break someone's code by default and they have to change it anyways to have their code run, you might as well force them to use the new API. When your library is used by hundreds of thousands of users, you need to be careful about these small modifications that seem minor because you are a core developer and following closely the development of Matplotlib. As a reminder, we broke people's code several time by removing unused module imports in our submodules: while I think we made the right move in removing those unused import, it also underlies the importance of backward compatibility and how small changes affect our users. The minimal amount of backward compatibility we should aim for is to be able to run the tests and build the document from two version ago with the current version without the tests crashing (understand error, not failures). |
I've put on the agenda for Monday's call... I'm not advocating for what @dstansby suggested - I'm advocating we just break the API. If there is a good way to do that smoothly, I'm all for that. |
I don't agree with this point.
In other words, it gives the users a simple way to temporarily use the old API. You can even make this an rcparam if you want...
You proposed #9565 (comment), which is essentially (sorry if I misunderstood the proposal?) saying provide Edit: Apologies, I missed the proposal in #9565 (comment), which may or may not work in this specific case, but more generally there are other proposed API changes (e.g. #5665) where shimming the entire old API just doesn't make sense; we should just accept that at some point an argument's default value switches from one value to another. |
We could always name the new function `stemplot()`. That won't clutter up
the API with silly looking function calls.
We have been bitten several times before with attempts to change the return
dtype. There are several other libraries that are built on top of us that
provide extra introspective plotting features that depend heavily on the
return type of our functions.
…On Mon, Oct 1, 2018 at 6:06 AM Antony Lee ***@***.***> wrote:
I don't think this solution has any advantages. If you are going to break
someone's code by default and they have to change it anyways to have their
code run, you might as well force them to use the new API.
I don't agree with this point.
- If they were just using stem() to plot, well, a stem plot and not do
any customization on the artist, then changing the return type doesn't
change anything for them. I'd bet this covers >95% of the use cases. (Of
course, 5% of hundreds of thousands of users is still a lot.)
- If they did, then they'll get a crash (because the returned type is
quite different) (unless they're exactly using the artist API that both the
old and new return type share, in which case things should *hopefully*
not change, but if the APIs are not consistent that should be seen as an
opportunity to rationalize our API). But then they can just replace
stem(...) by stem(..., use_old_api=True) as a quickfix (until we kill
the old return type), rather than figuring out how to replace the old API
by the new one.
In other words, it gives the users a simple way to temporarily use the old
API. You can even make this an rcparam if you want...
The proper way is to deprecate the old method. There are several ways to
do this, and it is just a question of taking the time to do it.
You proposed #9565 (comment)
<#9565 (comment)>,
which is essentially (sorry if I misunderstood the proposal?) saying
provide stem_v2 (name up to bikeshedding) with the new semantics and
deprecate stem; I guess we'd later rename stem_v2 to stem? Frankly I
don't like this approach at all; if you don't ever rename the function to
the old name you end up with silly names such as EnumCalendarInfoExEx
<https://docs.microsoft.com/en-us/windows/desktop/api/winnls/nf-winnls-enumcalendarinfoexex>
(and stem_v123 isn't really helpful to the code reader either) and if you
do that makes the diligent programmer (the one who actually follows each
upgrades and cleans up after each deprecation warning) do two changes
(stem->stem_v2->stem) instead of a single one (take care of the API change).
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#9565 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AARy-HQFJH3secTiYJsh9Vr3rGFCk2NHks5ugekZgaJpZM4QE_Di>
.
|
Having subtly different |
We have kind of a similar thing going on in the current versions. If you try to create an axes at the same subplot position as an existing axes, it will fall back to this existing axes.
There is a warning associated with this though:
Which means if you want the (expected and useful) behaviour of creating a new axes at the same position, you need to add a label (or in fact any other unique argument) to the creating function.
So here the old behaviour is kept, but a strategy of using the new behaviour is presented, i.e. by using a
|
@anntzer We can immediately deprecate stem. That's a fairly common practice, and provides backward compatibility. I'm also fine with the solution proposed by @ImportanceOfBeingErnest . |
I'd much rather do what @ImportanceOfBeingErnest proposed just above. Edit: see also https://gitter.im/matplotlib/matplotlib?at=5bb250d81c100a4f291607c7 re: backcompat policy. Reproduced here: I assume @NelleV is raising the point wrt #9565. My personal opinion is that while 2 minor versions' (or 1y) worth of deprecation warnings is fine in general, we should also weight the relative costs of actually putting in the relevant deprecation machinery vs the gains that the API changes gets us, rather than seeing this as an absolutely strict rule. |
I'm also -1 on creating a new function with a modified name and deprecating Also, changing the return value without a prior warning is a massive break. The |
Sorry to bikeshed a bit, but if it's going to change only the return value, then it can be called |
Well, of course it changes what's in the plot. Two important consquences of this:
Also replicating a recent comment in chat here for completeness: |
We can't just make this change with no deprecation period. Full stop. This is not fixing a bug, or making a minor improvement and changes how the end image appears somewhat. This is a significant change to how the library is operating in regards to how the call to The performance implications are worth making the change, certainly. But, IMO, we don't get to just throw up our hands and say "Doing a deprecation period is too hard". It's possible that code that did all the right things (in the standard code path) with 3.0 will no longer work with 3.1--and will end up in a traceback. As a concrete example, this would completely obliterate code that was making an animated stem plot. That's user-hostile behavior, IMO. If somebody made a similar change to What exactly is wrong with: def stem(self, *args, **kwargs, use_new_behavior=None):
if use_new_behavior is None:
warnings.warn("we're changing this")
use_new_behavior = False
if use_new_behavior:
self._new_stem(*args, **kwargs)
else:
self._old_stem(*args, **kwargs) (naming stuff aside) ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to do something more about the API change rather than just a note in the docs. This isn't changing plot appearance, this has the ability to making existing, working scripts, traceback.
@dopplershift That's literally what was proposed in #9565 (comment), and that was deemed no-good-enough in #9565 (comment), at which point I did "throw up our hands and said "Doing a deprecation period is too hard"." But I admittedly misread both the comment and its implication that #9565 (comment) would be OK (sorry about that). Not to derail this discussion, but the recent |
Since the conversation here is getting a bit ridiculous, I've opened a new PR at #12380 |
Instead of adding each line individually, add them as a line collection. This massively improves performance. Fixes #7969
StemContainer
from a list ofLine2D
to aLineCollection
(have added API change to reflect this)linemarker
this way