Skip to content

Tick rendering speedups #29394

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

scottshambaugh
Copy link
Contributor

@scottshambaugh scottshambaugh commented Jan 1, 2025

PR summary

I'm starting an effort to speed up 3D rendering, spinning up a profiler on some test cases. For this MR, I'm focusing on draw speed for an empty plot.

Based on py-spy profiling, the largest proportion of render time is dedicated to handling tick marks. This PR implements several optimizations to speed up tick drawing. Based on averaging 6x runs of the below script that executes 100x draw calls, the average render time for the basic empty plot decreased from 23.1 ms to 19.1 ms, which is a 21% improvement. Qualitatively, rotating the plot definitely feels snappier on my machine.

import matplotlib.pyplot as plt
import time

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.view_init(0, 0, 0)

plt.show(block=False)
plt.pause(1)

# Start timing
start_time = time.perf_counter()

n = 100
pause_time = 0.1
for i in range(n):
    ax.view_init(0, 0, i*10)
    plt.draw()
    plt.pause(pause_time)

# End timing
end_time = time.perf_counter()

# Calculate actual computation time by subtracting pauses
total_pause_time = n * pause_time
computation_time = (end_time - start_time) - total_pause_time
print(f"Total time: {end_time - start_time:.4f} seconds")
print(f"Computation time (excluding pauses): {computation_time:.4f} seconds")
print(f"Average time per loop: {computation_time / n * 1000:.2f} ms")

Attached are representative before and after profiler runs, which can be uploaded to https://www.speedscope.app/ to view
profile_ticks_speedups.speedscope.json
profile_ticks_base.speedscope.json

PR checklist

@WeatherGod
Copy link
Member

WeatherGod commented Jan 1, 2025 via email

@scottshambaugh scottshambaugh added this to the v3.11.0 milestone Jan 1, 2025
@scottshambaugh
Copy link
Contributor Author

scottshambaugh commented Jan 1, 2025

Redid the profiling in a more stable environment and the improvement wasn't as dramatic, about 48% speedup. The lru_cache was responsible for a little more than half that, but unfortunately it was causing some test failures so I had to take it out. Performance improvement is down to 21%, so still nice but not huge.

@scottshambaugh scottshambaugh marked this pull request as ready for review January 1, 2025 22:16
@@ -1285,7 +1285,7 @@ def _update_ticks(self):
tick.update_position(loc)
tick.label1.set_text(label)
tick.label2.set_text(label)
ticks = [*major_ticks, *minor_ticks]
self._ticks = [*major_ticks, *minor_ticks]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I‘m somewhat skeptical this is a good idea. It’s essentially a caching mechanism for the expensive (?) lookup of the current ticks. This adds more state and makes tick handling more complicated.

On a general note. The fundamental problem is the tick design:

  • Ticks consist of three Line2D objects and two Text objects. This is heavy.
  • Ticks are often created / modified.

This combination makes Ticks expensive. One mitigation strategy we use is that ticks are repurposed instead of being recreated - this is the update_ticks mechanism. The second trick is the lazy behavior of tick lists. Ticks are only created when you need them first. Overall these are still only bandaids on the fundamentally expensive tick design. And the change here goes into the same category.

I believe to substantially improve performance, we have to move away from heavy Tick instances and loops over then. See #5665 (comment). Alternatively/complementary, ticks could get more lightweight with a borg-like pattern.

Copy link
Contributor Author

@scottshambaugh scottshambaugh Jan 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with your assessment on Ticks being heavy, and a refactor being the real solution here. The highlighted branch of the call tree here shows the ticks taking up well over half the time associated with the figure & 3d axes creation and drawing.

Screenshot 2025-01-01 180309

Think the extra state isn't worth the ~20% speedup? I don't feel too strongly either way.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I‘d like to put this on hold. I plan to pick up the refactoring plan soon, and this additional state would complicate that. We might put the caching then behind the TickCollection abstraction.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants