speedup figure rendering removal of .startswith() calls and use generators instead of for loops #2289
Conversation
I simplified the testcase a bit so that it only parses input data and renders the figures (hence shorter times than those I originally posted to matplotlib-users). However, I am uploading here some figures from runsnakerun. Really nice tool!

PATCHED MATPLOTLIB-1.2.1: real 4m58.082s
UNPATCHED MATPLOTLIB-1.2.1: real 6m6.821s
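For readers who want to reproduce this kind of profile: a minimal sketch, not the reporter's actual script, of capturing cProfile output in a file that runsnakerun can open. `render_figures` is a hypothetical placeholder for the parse-and-render workload described above.

```python
import cProfile
import pstats

def render_figures():
    """Hypothetical stand-in for the reporter's script: parse the
    input data and render the figures."""
    pass

# Dump the stats to a file that runsnakerun can open, or print the
# hottest entries inline with pstats.
cProfile.run("render_figures()", "profile.out")
stats = pstats.Stats("profile.out")
stats.sort_stats("cumulative").print_stats(20)
```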
I don't understand this at all. I don't see where the suggested changes are in places that would be called frequently. I think we need to see a minimal example that illustrates the problem.
@mmokrejs, would you try #2291, please? I think it will speed up your use case. It would be interesting to know how the time in hist compares with what you get after your patch. In your PR, I think the use of comprehensions rather than loops is reasonable, but I don't see how the loops would be called often enough, with enough iterations, to make much difference. The startswith() substitutions would make sense only where they really are bottlenecks, and where they handle all cases. Where they are not slowing things down substantially, it would be better not to make the substitution; you can't beat startswith() for readability.
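For context, the two idioms under debate can be compared directly with timeit. This is an illustrative sketch, not code from the patch; the sample string is arbitrary:

```python
import timeit

setup = "s = 'matplotlib.axes.Axes'"

# Readable form: an attribute lookup plus a method call per string.
t_method = timeit.timeit("s.startswith('matplotlib')", setup=setup)

# Slice-and-compare form substituted in hot paths; faster per call,
# but equivalent only when the prefix length (here 10) is hard-coded.
t_slice = timeit.timeit("s[:10] == 'matplotlib'", setup=setup)

print("startswith: %.3f s  slice: %.3f s" % (t_method, t_slice))
```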
The case I see for the list comprehension improvement is for interactions. I am more skeptical of the improvements from replacing startswith().
@WeatherGod, regarding the comprehensions in Figure.draw(): they collect only artists added at the Figure level, things like Axes and the suptitle. There are rarely going to be very many of these, so I don't see how changing from the loops to the comprehensions could have a significant effect, except in the unusual case where one is drawing a huge number of artists directly in the Figure rather than in one or more Axes.
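As an illustration of the comprehension change being discussed, here is a toy sketch; `ToyArtist` is a hypothetical stand-in, not matplotlib's Artist class:

```python
class ToyArtist:
    """Toy stand-in for a Figure-level artist; not matplotlib's Artist."""
    def __init__(self, visible):
        self._visible = visible

    def get_visible(self):
        return self._visible

figure_children = [ToyArtist(i % 2 == 0) for i in range(6)]

# Loop-and-append style (the pre-existing pattern):
visible = []
for a in figure_children:
    if a.get_visible():
        visible.append(a)

# Equivalent list comprehension: same result in one expression, and it
# avoids looking up and calling visible.append on every iteration.
visible = [a for a in figure_children if a.get_visible()]
```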
I did a little timing test to compare this PR to #2291, using timeit in ipython:

z = np.random.randn(1000000)
timeit -r1 -n1 plt.hist(z, 10000)

The times on my machine are 10.3 s with master, 9.8 s with the present PR, and 7.7 s with #2291. Doing the same thing but with …
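The same benchmark as a self-contained script, using timeit's Python API in place of the IPython magic; the Agg backend is assumed so it runs headless:

```python
import timeit

import numpy as np
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the benchmark is headless
import matplotlib.pyplot as plt

z = np.random.randn(1000000)

# One repetition, one loop, mirroring `timeit -r1 -n1` in IPython.
t = timeit.timeit(lambda: plt.hist(z, 10000), number=1)
print("plt.hist(z, 10000): %.1f s" % t)
```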
Sorry, my mailserver was down for a long while and I have not caught up yet.

The problem with for loops is that the whole iterable must be built before the iteration starts. For huge lists or large items that consumes memory, sometimes all of it. This is a general pythonic speed/memory improvement, so I am surprised you don't believe in it. Same with blah.startswith(): the blah object must first be instantiated, and only then can its .startswith() method be called. That is slow, and every performance tuning guide advises avoiding it.

I showed the performance profile from the python profiler, rendered graphically through the runsnakerun application. From the columns you can infer that I rendered 33 figures. The numbers show how much time was spent in each function and how many times it was entered.

I applied #2291 on top of my changes in mpl-1.2.1. The machine is quite loaded now, so ignore the real wall-clock time, but the timing itself should be relatively precise: real 13m55.239s. Also, from a quick glance at the runsnakerun output I see #2291 made things slower, but I don't have my wide screen here to give you a similarly-sized screenshot.

Hmm, I will have to find some time to come up with a testcase. Just generate 500 000 colors and try to assign them to individual legend items, then in another iteration do the same for the label text ... and you should be there. ;-)
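A minimal illustration of the memory argument being made here, not code from the patch; the 500 000 items echo the proposed testcase, and the hex-color formatting is hypothetical:

```python
import sys

n = 500000  # the testcase size suggested above

# Materialized list: every formatted color string exists at once.
colors_list = ["#%06x" % (i % 0xffffff) for i in range(n)]

# Generator: only the iteration state is held; items are made on demand.
colors_gen = ("#%06x" % (i % 0xffffff) for i in range(n))

print(sys.getsizeof(colors_list))  # megabytes of pointers, plus the strings
print(sys.getsizeof(colors_gen))   # a small constant, roughly 100 bytes
```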
I re-tested the #2291 patch containing 8bd40e2 in combination with my changes on mpl-1.2.1, and it is a bit faster with #2291.

WITHOUT #2291:
WITH #2291:
ANOTHER ATTEMPT WITH #2291 (just confirms the same timing):
@mmokrejs So, the remaining question is whether adding your changes to #2291 causes a substantial speedup over what we have with #2291 alone. Regarding your earlier comment about speedup strategies: it is not a matter of what one believes; it is a matter of tradeoffs among readability, robustness, and speed.
WITHOUT 8bd40e2:
ALSO WITHOUT artist.py.patch:
ALSO WITHOUT axes.py.patch:
ALSO WITHOUT figure.py.patch (hence vanilla mpl-1.2.1; note the numbers are a bit lower than in comment #22426060, as I meanwhile tweaked my app to use izip(), imap(), and ifilter() from itertools more frequently):
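For reference, a Python 2-era sketch of the itertools functions mentioned; this thread predates the Python 3 transition, and the labels/colors are hypothetical:

```python
# In Python 2, the itertools variants yield items lazily, whereas the
# zip()/map()/filter() builtins there build whole intermediate lists.
from itertools import izip, imap, ifilter

labels = ["label%d" % i for i in xrange(500000)]      # hypothetical labels
colors = imap(lambda i: "#%06x" % i, xrange(500000))  # lazy color strings
nonempty = ifilter(None, labels)                      # lazy filtering

for label, color in izip(labels, colors):  # no intermediate list of pairs
    pass  # e.g. attach `color` to the legend entry for `label`

# In Python 3, zip(), map() and filter() are lazy themselves, so the
# itertools counterparts were dropped.
```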
@mmokrejs Perhaps I am misunderstanding, but I don't see that these tests address the question of whether some of the patches in this PR should be merged. If not, I would like to close the PR.
The PR is a patch against master, but I mention the filenames for 1.2.1 as it is easier for me to remember; the difference is that in 1.3 you renamed axes.py to _axes.py, etc. Anyway, I backed out my changes one by one and the runtimes increased as expected. In other words, artist.py.patch by itself saves 4m20 - 3m33 = 47 sec on top of your #2291. The axes.py.patch and figure.py.patch have a marginal effect (in terms of execution time).
I'm sorry, but this still is not making sense to me. I understand the conclusion that axes.py.patch and figure.py.patch have a marginal effect, as I expected, because they are trying to optimize code that contributes little to the total time. Regarding artist.py.patch, however, I don't see that you have compared #2291 with that patch to #2291 without it, which is the only relevant comparison at this point. The code affected by artist.py.patch is bypassed by #2291, so I don't expect to see any effect of artist.py.patch when #2291 is in place.
So mpl-1.2.1 with just the patch from #2291 gives: real 3m32.522s.

I still think the code in figure.py should be fixed, because doing appends in a loop is just ugly, never mind that it is bypassed now in my test. But I don't have time to invent a testcase for that. A few lines below where my patch ends is another place where only some of the pre-filtered data are used; the filtering should be folded into the lines being changed, in a "decent" way.
So, I did some experiments to see if there were any improvements in interactivity, as I suspected there might be. While this was a completely subjective test, I really didn't notice any difference in interactivity. I even tried fixing a few instances of "append" in mplot3d, and it just didn't seem to make a noticeable difference. I still think the changes to figure.py should definitely be accepted, but given that "startswith" and "in" are semantically different, I just don't see the justification for that substitution.

I do think it would be a worthwhile endeavor, however, to rigorously profile and document the code so that we can take a more holistic approach to identifying ways to improve our performance, much like Eric did. I would be curious to see just how much time is spent in different modules for the initial draw (and subsequent redraws), and how those times scale with an increasing number of elements and data. Might be a useful Summer of Code project for next year, maybe?
@mmokrejs: I've just reread this whole thread, and I still don't see the justification for it, as @efiring suggested. The question is: is there a significant marginal improvement with this patch + #2291 vs. #2291 alone? I don't think that any of your benchmarks make that comparison, unless I'm missing something. As @efiring said, there is a tradeoff between code readability/maintainability and speed here. I'm willing to sacrifice some readability for speed if the effect is significant, but I'm just not convinced that it is. In general, I'm fine with the changes to list comprehensions whether they result in significant performance improvements or not (I'm quite certain the original code predates list comprehensions), but the startswith changes are another matter.

To address @WeatherGod's comments: I've done a little exploration of some benchmark management tools, such as codespeed (from PyPy) and vbench (from Pandas/Wes McKinney). I think vbench is closer to what we want, and it will track benchmark performance over the history of a git repo. But it appears to be a good chunk of work to adapt it to our needs. As you say, not a bad sized project for Google Summer of Code, or any other interested party. It would be nice to have a benchmark suite in any event: with some of the performance work I've been doing (such as #2156), it would be handy to know, as I'm optimizing for one benchmark, that I'm not regressing another.
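Absent vbench integration, even a minimal timeit-based harness along these lines could catch such regressions. This is a sketch under assumed names, not vbench's actual API; vbench would additionally track results across git history:

```python
import timeit

# Hypothetical benchmark table; each entry is (statement, setup).
BENCHMARKS = {
    "hist_10000_bins": (
        "plt.hist(z, 10000); plt.close('all')",
        "import matplotlib; matplotlib.use('Agg'); "
        "import matplotlib.pyplot as plt; "
        "import numpy as np; z = np.random.randn(1000000)",
    ),
}

# Report the best of a few repetitions of each benchmark.
for name, (stmt, setup) in sorted(BENCHMARKS.items()):
    best = min(timeit.repeat(stmt, setup=setup, repeat=3, number=1))
    print("%-20s %.2f s" % (name, best))
```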
I think this needs to be closed and re-opened with appropriate justification for reducing the code's readability. Currently I do not see any sizeable benefit to replacing startswith().
Please see the email thread "[Matplotlib-users] mpl-1.2.1: Speedup code by removing .startswith() calls and some for loops" for more details.