Skip to content

Improve matplotlib.pyplot importtime by caching ArtistInspector #23686

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Aug 27, 2022

Conversation

eendebakpt
Copy link
Contributor

@eendebakpt eendebakpt commented Aug 20, 2022

PR Summary

During import of matplotlib.pyplot the ArtistInspector performs actions for objects of type Artist. Several actions are performed twice and can be avoided by caching the result.

This PR improves the import time from by 1149 ms to 808 ms (the numbers are system dependent, quoted numbers are on Windows with python 3.10).

The PR consists of 3 elements:

  • Adding a lru_cache to ArtistInspector.is_alias.
  • Creating a cached method ArtistInspector.number_of_parameters( for len(inspect.signature(func).parameters)
  • Creating a method getdoc_pure that is equal to inspect.getdoc, but without an expensive call to inspect.cleandoc.

Notes:

This part of the code was identified by using https://github.com/asottile/importtime-waterfall and https://jiffyclub.github.io/snakeviz/. Analysis of the data shows there are several other places where the import time can be improved, but that is for another PR.

The usage of caching adds some memory overhead, but not a lot. This can be inspected for this PR by

import matplotlib.artist
c=matplotlib.artist.ArtistInspector.is_alias
c.cache_info()

Timings are obtained using python -c "import time; t0=time.perf_counter(); import matplotlib.pyplot; dt=time.perf_counter()-t0; print(f'{1e3*dt:.2f} ms')"

The method ArtistInspector.get_valid_values is still a quite expensive method that is called during startup. The full docstring is used there, so it could not easily be refactored.

PR Checklist

Tests and Styling

  • Has pytest style unit tests (and pytest passes).
  • Is Flake 8 compliant (install flake8-docstrings and run flake8 --docstring-convention=all).

Documentation

  • New features have an entry in doc/users/next_whats_new/ (follow instructions in README.rst there). No new feature

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for opening your first PR into Matplotlib!

If you have not heard from us in a while, please feel free to ping @matplotlib/developers or anyone who has commented on the PR. Most of our reviewers are volunteers and sometimes things fall through the cracks.

You can also join us on gitter for real-time discussion.

For details on testing, writing docs, and our review process, please see the developer guide

We strive to be a welcoming and open project. Please follow our Code of Conduct.

Copy link
Member

@timhoffm timhoffm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't checked the timing myself, but trusting your numbers this would be an awesome improvement!

@eendebakpt eendebakpt force-pushed the importtime_artist branch 2 times, most recently from f7d73b7 to 2342fa4 Compare August 21, 2022 20:35
@eendebakpt
Copy link
Contributor Author

I haven't checked the timing myself, but trusting your numbers this would be an awesome improvement!

@timhoffm I had to refactor the PR a bit since the number of attributes of some Artist classes change over their lifetimes.

The numbers changed a bit (they also depend on the temperature in my room), but there is still a significant improvement.

The inspect.getdoc is still one of the move expensive operations.

@timhoffm
Copy link
Member

Note also #22749 which should make docstring parsing obsolete eventually.

QuLogic
QuLogic previously approved these changes Aug 23, 2022
@QuLogic QuLogic dismissed their stale review August 23, 2022 21:51

Didn't mean to approve yet

@eendebakpt eendebakpt requested a review from QuLogic August 24, 2022 21:01
@eendebakpt eendebakpt changed the title Draft: improve matplotlib.pyplot importtime by caching ArtistInspector Improve matplotlib.pyplot importtime by caching ArtistInspector Aug 24, 2022
@tacaswell tacaswell added this to the v3.7.0 milestone Aug 25, 2022
Copy link
Member

@tacaswell tacaswell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a weak preference to leave these as self.xxx rather than hard-coding the class and trusting Python to do the fallback correctly (unless that kills the performance gains).

@eendebakpt
Copy link
Contributor Author

I have a weak preference to leave these as self.xxx rather than hard-coding the class and trusting Python to do the fallback correctly (unless that kills the performance gains).

I just checked, and making is_alias a normal method instead of a class method hurts performance because then for each new instance of the ArtistInspector the cache is empty.

@tacaswell
Copy link
Member

I agree the methods should stay as static methods, but they should be accessed via self rather than via the class object

In [1]: class Test:
   ...:     @staticmethod
   ...:     def foo():
   ...:         print('Called!')
   ...:
   ...:     def __init__(self):
   ...:         self.foo()
   ...:
   ...: t = Test()
Called!

This way if any sub-classes want to over-ride this (are overriding this) or we want to re-factor the class (maybe pull a base class out of the top?) the usage won't change.

@tacaswell
Copy link
Member

@eendebakpt Do you feel comfortable squashing this to one commit?

Copy link
Member

@tacaswell tacaswell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be squashed or squash-merged.

@eendebakpt
Copy link
Contributor Author

Should be squashed or squash-merged.

If you want I can squash the comments and do a force push. But usually a squash-merge is preferred, so the full history remains intact.

@tacaswell
Copy link
Member

But usually a squash-merge is preferred, so the full history remains intact.

What do you mean? If it is squash-merged there is still only one commit in the history. By default it will just concatenate all of the commit messages, but you can do the same with squashing + force push.

@eendebakpt
Copy link
Contributor Author

But usually a squash-merge is preferred, so the full history remains intact.

What do you mean? If it is squash-merged there is still only one commit in the history. By default it will just concatenate all of the commit messages, but you can do the same with squashing + force push.

I mean the history on this PR. For the main branch there will indeed only be one commit in either case.

@tacaswell
Copy link
Member

Fair enough. I tend to view that as a bad thing because it makes it harder to tell if the branch has actually been merged in or not.

Either way, I think this is good to go once the linting issue is fixed!

@timhoffm
Copy link
Member

Are we doing a 3.6 RC2 and if so could this still get in there? Or do we have to delay to 3.7 to be on the safe side?

@tacaswell
Copy link
Member

I think it would be fine to pull this back for 3.6rc2.

meeseeksmachine pushed a commit to meeseeksmachine/matplotlib that referenced this pull request Aug 27, 2022
@timhoffm
Copy link
Member

@eendebakpt thanks and congratulations on your first contribution to Matplotlib!

We hope to see you back!

We’re about to release version 3.6, so this will be publically available in short. 🚀

timhoffm added a commit that referenced this pull request Aug 27, 2022
…686-on-v3.6.x

Backport PR #23686 on branch v3.6.x (Improve matplotlib.pyplot importtime by caching ArtistInspector)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants