
Improve rendering when using multiple / large labels #21958


Status: Draft (wants to merge 5 commits into main)

Conversation

ojeda-e
Contributor

@ojeda-e ojeda-e commented Dec 15, 2021

PR Summary

Fixes #21895
The example reported in #21895 initially runs in 10.079 s (locally, on my old laptop). Moving the cache in lib/matplotlib/text.py to an attribute of Text brings the example down to 4.426 s, a ~56% reduction in runtime (roughly 2.3x faster). I added a unit test that uses labels of increasing length (growing as 10**i) and asserts that times are shorter. This is not quite the approach suggested by @tacaswell, because I am not comparing text vs. no text. I'm not fully convinced of this approach either, but can't think of anything better; I'd appreciate suggestions to make the test more solid.
No changes in matplotlib/lib/matplotlib/mathtext.py.
As a separate note, this also probably needs a benchmark in mpl_benchmarks as suggested by @jklymak on Gitter (thanks Jody!).
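The change described above can be sketched as follows. All names here are illustrative stand-ins, not Matplotlib's actual internals: the idea is moving the cache from a class attribute shared by every Text instance to a per-instance attribute, so each artist only ever searches its own (small) cache.

```python
class TextSketch:
    """Toy stand-in for matplotlib's Text artist (illustrative only)."""

    def __init__(self, s):
        self.s = s
        # Per-instance cache instead of a class-level shared dict:
        # each artist remembers only its own layout results.
        self._cached = {}

    def get_layout(self, fontsize=10.0, dpi=100.0):
        key = (self.s, fontsize, dpi)
        if key not in self._cached:
            # Stand-in for the expensive text-layout computation.
            self._cached[key] = (len(self.s) * fontsize * dpi / 72.0, fontsize)
        return self._cached[key]
```

With a shared class-level dict, every lookup competes with entries from all other labels (and evictions from a bounded dict can thrash); with a per-instance dict, lookups stay O(1) on a handful of keys regardless of how many labels the figure contains.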

PR Checklist

Tests and Styling

  • Has pytest style unit tests (and pytest passes).
  • Is Flake 8 compliant (install flake8-docstrings and run flake8 --docstring-convention=all).

Documentation

  • [N/A] New features are documented, with examples if plot related.
  • [N/A] New features have an entry in doc/users/next_whats_new/ (follow instructions in README.rst there).
  • [N/A] API changes documented in doc/api/next_api_changes/ (follow instructions in README.rst there).
  • [N/A] Documentation is sphinx and numpydoc compliant (the docs should build without error).

@jklymak
Member

jklymak commented Dec 15, 2021

This seems reasonable.

Unfortunately, the figure doesn't survive pickling; i.e., with the new cache, the objects are not serializable. I expect the figure used to have something that detached the cache, and you could likely do the same thing here? But I'm not an expert on (nor a fan of) pickling figures ;-)

@ojeda-e
Contributor Author

ojeda-e commented Dec 15, 2021

Thanks for the brief explanation @jklymak.
I managed to make the weakref picklable by adding __setstate__ to Text. Using the same example from the issue, the improvement is preserved, and with this change test_pickle.py passes locally.
However, some of my local tests are failing, even on the main branch. I pushed the changes and will use the CI to check whether more tests fail. (Sorry about that; I know it isn't good practice, but it will help while I figure out what broke in my environment.)
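One common pattern that matches the fix described here (an assumption on my part; the actual PR may differ in detail) is to drop the cache in __getstate__ and recreate it empty in __setstate__, so unpicklable entries such as weakrefs never reach the pickler:

```python
import pickle
import weakref


class Target:
    """Something a cache entry might weakly reference."""


class CachedArtist:
    """Illustrative sketch, not Matplotlib's Text class."""

    def __init__(self, s):
        self.s = s
        self._cached = {}  # may hold unpicklable objects (e.g. weakrefs)

    def __getstate__(self):
        state = self.__dict__.copy()
        state.pop('_cached', None)  # drop the cache before pickling
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        self._cached = {}           # recreate an empty cache on unpickle


t = Target()
a = CachedArtist("label")
a._cached[("label", 10)] = weakref.ref(t)  # weakrefs cannot be pickled
b = pickle.loads(pickle.dumps(a))          # works: the cache was dropped
```

The cache is a pure performance optimization, so discarding it on pickle is safe; it simply repopulates on the next draw.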

@jklymak
Member

jklymak commented Dec 15, 2021

Many tests fail locally for me as well. If you are pretty sure it's not due to your PR, it's perfectly fine to push to the more controlled test environment.

@ojeda-e
Contributor Author

ojeda-e commented Dec 16, 2021

Thanks for the comments @tacaswell; I hope the recent changes address them.

@anntzer
Contributor

anntzer commented Dec 16, 2021

I wonder if it may make sense to keep the old global cache layer as well, so that if you're e.g. drawing 10 axes, each with the same tick labels, the labels on the later axes still benefit from the work done on the first. Of course, that would require a bit more rearchitecting of the caching...

@anntzer anntzer mentioned this pull request Jan 20, 2022
@@ -107,7 +107,6 @@ class Text(Artist):
     """Handle storing and drawing of text in window or data coordinates."""

     zorder = 3
-    _cached = cbook.maxdict(50)
Member

Is this dict really a memory hog if it gets larger? Why not just set it to a large number and move on? Just because we can build a double caching structure doesn't mean we should. Given that we have had very few complaints of this nature, my assumption is that most people are not hitting the 50-element limit, so bumping it to 5000 for the few people who need it doesn't seem like a terrible idea.

Note that this largely catches users who have accidentally used categoricals and created thousands of ticks. If anything, I think we should add logic in the categorical ticker that raises a warning if more than 100 ticks are being asked for.
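The warning suggested above could look something like this (a sketch only; check_tick_count and the limit of 100 are illustrative, not Matplotlib code):

```python
import warnings


def check_tick_count(n_ticks, limit=100):
    """Warn when a categorical axis is about to create an unreasonable
    number of ticks, which usually means string data was passed by accident.
    Illustrative sketch, not Matplotlib's categorical ticker."""
    if n_ticks > limit:
        warnings.warn(
            f"Categorical axis would create {n_ticks} ticks; did you "
            "accidentally pass string data where numbers were intended?",
            UserWarning,
        )
```

Warning (rather than raising) keeps legitimate many-category plots working while pointing accidental-categorical users at the real cause of the slowdown.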

@QuLogic QuLogic modified the milestones: v3.6.0, v3.7.0 Jul 5, 2022
@QuLogic
Member

QuLogic commented Jul 5, 2022

@anntzer what is the status here after #22271? Does this just need a rebase?

@anntzer
Contributor

anntzer commented Jul 11, 2022

No, because then there was #22323, which further got rid of the explicit _cached in favor of an lru_cache. There may still be some small benefit to caching layout info per instance, although restoring that (it was mostly removed in #22271, which places the cache a bit earlier in the call tree) would require some more work, not a simple rebase (and I don't know whether timings would show a large further improvement).

@tacaswell tacaswell removed this from the v3.7.0 milestone Dec 16, 2022
@tacaswell tacaswell added this to the v3.8.0 milestone Dec 16, 2022
@tacaswell
Member

Moved to 3.8 as the rebase is non-trivial and there have been other changes to this caching layer, so we need to evaluate whether we still need this.

@ksunden ksunden modified the milestones: v3.8.0, future releases Aug 8, 2023

Successfully merging this pull request may close these issues.

[Bug]: slow rendering of multiple axes (time scales as 2nd power of label count)
6 participants