speedup figure rendering removal of .startswith() calls and use generators instead of for loops #2289
Conversation
I simplified the testcase a bit so that it only parses input data and renders the figures (hence shorter times than those I originally posted to matplotlib-users). However, I am uploading here some figures from runsnakerun. Really nice tool!

PATCHED MATPLOTLIB-1.2.1: real 4m58.082s
UNPATCHED MATPLOTLIB-1.2.1: real 6m6.821s
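For readers who want to reproduce this kind of profile: a minimal sketch, not the reporter's actual script, of capturing cProfile output in a file that runsnakerun can open. `render_figures` is a hypothetical placeholder for the parse-and-render workload described above.

```python
import cProfile
import pstats

def render_figures():
    """Hypothetical stand-in for the reporter's script: parse the
    input data and render the figures."""
    pass

# Dump the stats to a file that runsnakerun can open, or print the
# hottest entries inline with pstats.
cProfile.run("render_figures()", "profile.out")
stats = pstats.Stats("profile.out")
stats.sort_stats("cumulative").print_stats(20)
```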
I don't understand this at all. I don't see where the suggested changes are in places that would be called frequently. I think we need to see a minimal example that illustrates the problem.
@mmokrejs, would you try #2291, please? I think it will speed up your use case. It would be interesting to know how the time in hist compares with what you get after your patch. In your PR, I think the use of comprehensions rather than loops is reasonable, but I don't see how the loops would be called often enough, with enough iterations, to make much difference. The startswith() substitutions would make sense only where they really are bottlenecks, and where they handle all cases. Where they are not slowing things down substantially, it would be better not to make the substitution; you can't beat startswith() for readability.
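For context, the two idioms under debate can be compared directly with timeit. This is an illustrative sketch, not code from the patch; the sample string is arbitrary:

```python
import timeit

setup = "s = 'matplotlib.axes.Axes'"

# Readable form: an attribute lookup plus a method call per string.
t_method = timeit.timeit("s.startswith('matplotlib')", setup=setup)

# Slice-and-compare form substituted in hot paths; faster per call,
# but equivalent only when the prefix length (here 10) is hard-coded.
t_slice = timeit.timeit("s[:10] == 'matplotlib'", setup=setup)

print("startswith: %.3f s  slice: %.3f s" % (t_method, t_slice))
```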
The case I see for the list comprehension improvement is for interactions. I am more skeptical of the improvements from replacing startswith().
@WeatherGod, regarding the comprehensions in Figure.draw(): they collect only artists added at the Figure level, things like Axes and the suptitle. There are rarely going to be very many of these, so I don't see how changing from the loops to the comprehensions could have a significant effect, except in the unusual case where one is drawing a huge number of artists directly in the Figure rather than in one or more Axes.
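As an illustration of the comprehension change being discussed, here is a toy sketch; `ToyArtist` is a hypothetical stand-in, not matplotlib's Artist class:

```python
class ToyArtist:
    """Toy stand-in for a Figure-level artist; not matplotlib's Artist."""
    def __init__(self, visible):
        self._visible = visible

    def get_visible(self):
        return self._visible

figure_children = [ToyArtist(i % 2 == 0) for i in range(6)]

# Loop-and-append style (the pre-existing pattern):
visible = []
for a in figure_children:
    if a.get_visible():
        visible.append(a)

# Equivalent list comprehension: same result in one expression, and it
# avoids looking up and calling visible.append on every iteration.
visible = [a for a in figure_children if a.get_visible()]
```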
I did a little timing test to compare this PR to #2291, using timeit in ipython:

z = np.random.randn(1000000)
timeit -r1 -n1 plt.hist(z, 10000)

The times on my machine are 10.3 s with master, 9.8 s with the present PR, and 7.7 s with #2291. Doing the same thing but with …
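The same benchmark as a self-contained script, using timeit's Python API in place of the IPython magic; the Agg backend is assumed so it runs headless:

```python
import timeit

import numpy as np
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the benchmark is headless
import matplotlib.pyplot as plt

z = np.random.randn(1000000)

# One repetition, one loop, mirroring `timeit -r1 -n1` in IPython.
t = timeit.timeit(lambda: plt.hist(z, 10000), number=1)
print("plt.hist(z, 10000): %.1f s" % t)
```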
Sorry, my mailserver was down for a long while and I have not caught up yet.

The problem with for loops is that the whole iterable must be built before the iteration starts. For huge lists or large items that consumes memory, sometimes all of it. This is a general pythonic speed/memory improvement, so I am surprised you don't believe in it. Same with blah.startswith(): the blah object must first be instantiated, and only then can its .startswith() method be called. That is slow, and every performance tuning guide advises avoiding it.

I showed the performance profile from the python profiler, rendered graphically through the runsnakerun application. From the columns you can infer that I rendered 33 figures. The numbers show how much time was spent in each function and how many times it was entered.

I applied #2291 on top of my changes in mpl-1.2.1. The machine is quite loaded now, so ignore the real wall-clock time, but the timing itself should be relatively precise: real 13m55.239s. Also, from a quick glance at the runsnakerun output I see #2291 made things slower, but I don't have my wide screen here to give you a similarly-sized screenshot.

Hmm, I will have to find some time to come up with a testcase. Just generate 500 000 colors and try to assign them to individual legend items, then in another iteration do the same for the label text ... and you should be there. ;-)
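A minimal illustration of the memory argument being made here, not code from the patch; the 500 000 items echo the proposed testcase, and the hex-color formatting is hypothetical:

```python
import sys

n = 500000  # the testcase size suggested above

# Materialized list: every formatted color string exists at once.
colors_list = ["#%06x" % (i % 0xffffff) for i in range(n)]

# Generator: only the iteration state is held; items are made on demand.
colors_gen = ("#%06x" % (i % 0xffffff) for i in range(n))

print(sys.getsizeof(colors_list))  # megabytes of pointers, plus the strings
print(sys.getsizeof(colors_gen))   # a small constant, roughly 100 bytes
```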
I re-tested the #2291 patch containing 8bd40e2 in combination with my changes on mpl-1.2.1, and it is a bit faster with #2291.

WITHOUT #2291:
WITH #2291:
ANOTHER ATTEMPT WITH #2291 (just confirms the same timing):
@mmokrejs So, the remaining question is whether adding your changes to #2291 causes a substantial speedup over what we have with #2291 alone. Regarding your earlier comment about speedup strategies: it is not a matter of what one believes; it is a matter of tradeoffs among readability, robustness, and speed.
WITHOUT 8bd40e2:
ALSO WITHOUT artist.py.patch:
ALSO WITHOUT axes.py.patch:
ALSO WITHOUT figure.py.patch (hence vanilla mpl-1.2.1; note the numbers are a bit lower than in comment #22426060, as I meanwhile tweaked my app to use izip(), imap(), and ifilter() from itertools more frequently):
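For reference, a Python 2-era sketch of the itertools functions mentioned; this thread predates the Python 3 transition, and the labels/colors are hypothetical:

```python
# In Python 2, the itertools variants yield items lazily, whereas the
# zip()/map()/filter() builtins there build whole intermediate lists.
from itertools import izip, imap, ifilter

labels = ["label%d" % i for i in xrange(500000)]      # hypothetical labels
colors = imap(lambda i: "#%06x" % i, xrange(500000))  # lazy color strings
nonempty = ifilter(None, labels)                      # lazy filtering

for label, color in izip(labels, colors):  # no intermediate list of pairs
    pass  # e.g. attach `color` to the legend entry for `label`

# In Python 3, zip(), map() and filter() are lazy themselves, so the
# itertools counterparts were dropped.
```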
@mmokrejs Perhaps I am misunderstanding, but I don't see that these tests address the question of whether some of the patches in this PR should be merged. If not, I would like to close the PR.
The PR is a patch against master, but I mention the filenames for 1.2.1 as it is easier for me to remember; the difference is that in 1.3 you renamed axes.py to _axes.py, etc. Anyway, I backed out my changes one by one and the runtimes increased as expected. In other words, artist.py.patch by itself saves 4m20 - 3m33 = 47 sec on top of your #2291. The axes.py.patch and figure.py.patch have a marginal effect (in terms of execution time).
I'm sorry, but this still is not making sense to me. I understand the conclusion that axes.py.patch and figure.py.patch have a marginal effect, as I expected, because they are trying to optimize code that contributes little to the total time. Regarding artist.py.patch, however, I don't see that you have compared #2291 with that patch to #2291 without it, which is the only relevant comparison at this point. The code affected by artist.py.patch is bypassed by #2291, so I don't expect to see any effect of artist.py.patch when #2291 is in place.
So mpl-1.2.1 with just the patch from #2291 gives: real 3m32.522s.

I still think the code in figure.py should be fixed, because doing appends in a loop is just ugly, never mind that it is bypassed now in my test. But I don't have time to invent a testcase for that. A few lines below where my patch ends is another place where only some of the pre-filtered data are used; the filtering should be folded into the lines being changed, in a "decent" way.
So, I did some experiments to see if there were any improvements in interactivity, as I suspected there might be. While this was a completely subjective test, I really didn't notice any difference in interactivity. I even tried fixing a few instances of "append" in mplot3d, and it just didn't seem to make a noticeable difference. I still think the changes to figure.py should definitely be accepted, but given that "startswith" and "in" are semantically different, I just don't see the justification for that substitution.

I do think it would be a worthwhile endeavor, however, to rigorously profile and document the code so that we can take a more holistic approach to identifying ways to improve our performance, much like Eric did. I would be curious to see just how much time is spent in different modules for the initial draw (and subsequent redraws), and how those times scale with an increasing number of elements and data. Might be a useful Summer of Code project for next year, maybe?
@mmokrejs: I've just reread this whole thread, and I still don't see the justification for it, as @efiring suggested. The question is: is there a significant marginal improvement with this patch + #2291 vs. #2291 alone? I don't think that any of your benchmarks make that comparison, unless I'm missing something. As @efiring said, there is a tradeoff between code readability/maintainability and speed here. I'm willing to sacrifice some readability for speed if the effect is significant, but I'm just not convinced that it is. In general, I'm fine with the changes to list comprehensions whether they result in significant performance improvements or not (I'm quite certain the original code predates list comprehensions), but the startswith changes are another matter.

To address @WeatherGod's comments: I've done a little exploration of some benchmark management tools, such as codespeed (from PyPy) and vbench (from Pandas/Wes McKinney). I think vbench is closer to what we want, and it will track benchmark performance over the history of a git repo. But it appears to be a good chunk of work to adapt it to our needs. As you say, not a bad sized project for Google Summer of Code, or any other interested party. It would be nice to have a benchmark suite in any event: with some of the performance work I've been doing (such as #2156), it would be handy to know, as I'm optimizing for one benchmark, that I'm not regressing another.
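Absent vbench integration, even a minimal timeit-based harness along these lines could catch such regressions. This is a sketch under assumed names, not vbench's actual API; vbench would additionally track results across git history:

```python
import timeit

# Hypothetical benchmark table; each entry is (statement, setup).
BENCHMARKS = {
    "hist_10000_bins": (
        "plt.hist(z, 10000); plt.close('all')",
        "import matplotlib; matplotlib.use('Agg'); "
        "import matplotlib.pyplot as plt; "
        "import numpy as np; z = np.random.randn(1000000)",
    ),
}

# Report the best of a few repetitions of each benchmark.
for name, (stmt, setup) in sorted(BENCHMARKS.items()):
    best = min(timeit.repeat(stmt, setup=setup, repeat=3, number=1))
    print("%-20s %.2f s" % (name, best))
```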
I think this needs to be closed and re-opened with appropriate justification for reducing the code's readability. Currently I do not see any sizeable benefit to replacing startswith().
Please see the email thread "[Matplotlib-users] mpl-1.2.1: Speedup code by removing .startswith() calls and some for loops" for more details.