-
-
Notifications
You must be signed in to change notification settings - Fork 7.9k
Improve matplotlib.pyplot importtime by caching ArtistInspector #23686
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for opening your first PR into Matplotlib!
If you have not heard from us in a while, please feel free to ping @matplotlib/developers
or anyone who has commented on the PR. Most of our reviewers are volunteers and sometimes things fall through the cracks.
You can also join us on gitter for real-time discussion.
For details on testing, writing docs, and our review process, please see the developer guide
We strive to be a welcoming and open project. Please follow our Code of Conduct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't checked the timing myself, but trusting your numbers this would be an awesome improvement!
f7d73b7
to
2342fa4
Compare
@timhoffm I had to refactor the PR a bit since the number of attributes of some Artist classes change over their lifetimes. The numbers changed a bit (they also depend on the temperature in my room), but there is still a significant improvement. The |
Note also #22749 which should make docstring parsing obsolete eventually. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a weak preference to leave these as self.xxx
rather than hard-coding the class and trusting Python to do the fallback correctly (unless that kills the performance gains).
I just checked, and making |
I agree the methods should stay as static methods, but they should be accessed via In [1]: class Test:
...: @staticmethod
...: def foo():
...: print('Called!')
...:
...: def __init__(self):
...: self.foo()
...:
...: t = Test()
Called! This way if any sub-classes want to over-ride this (are overriding this) or we want to re-factor the class (maybe pull a base class out of the top?) the usage won't change. |
@eendebakpt Do you feel comfortable squashing this to one commit? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be squashed or squash-merged.
If you want I can squash the comments and do a force push. But usually a squash-merge is preferred, so the full history remains intact. |
What do you mean? If it is squash-merged there is still only one commit in the history. By default it will just concatenate all of the commit messages, but you can do the same with squashing + force push. |
I mean the history on this PR. For the main branch there will indeed only be one commit in either case. |
Fair enough. I tend to view that as a bad thing because it makes it harder to tell if the branch has actually been merged in or not. Either way, I think this is good to go once the linting issue is fixed! |
57c393d
to
8719715
Compare
Are we doing a 3.6 RC2 and if so could this still get in there? Or do we have to delay to 3.7 to be on the safe side? |
I think it would be fine to pull this back for 3.6rc2. |
… caching ArtistInspector
@eendebakpt thanks and congratulations on your first contribution to Matplotlib! We hope to see you back! We’re about to release version 3.6, so this will be publically available in short. 🚀 |
…686-on-v3.6.x Backport PR #23686 on branch v3.6.x (Improve matplotlib.pyplot importtime by caching ArtistInspector)
PR Summary
During import of matplotlib.pyplot the
ArtistInspector
performs actions for objects of typeArtist
. Several actions are performed twice and can be avoided by caching the result.This PR improves the import time from by 1149 ms to 808 ms (the numbers are system dependent, quoted numbers are on Windows with python 3.10).
The PR consists of 3 elements:
lru_cache
toArtistInspector.is_alias
.ArtistInspector.number_of_parameters(
forlen(inspect.signature(func).parameters)
getdoc_pure
that is equal toinspect.getdoc
, but without an expensive call toinspect.cleandoc
.Notes:
This part of the code was identified by using https://github.com/asottile/importtime-waterfall and https://jiffyclub.github.io/snakeviz/. Analysis of the data shows there are several other places where the import time can be improved, but that is for another PR.
The usage of caching adds some memory overhead, but not a lot. This can be inspected for this PR by
Timings are obtained using
python -c "import time; t0=time.perf_counter(); import matplotlib.pyplot; dt=time.perf_counter()-t0; print(f'{1e3*dt:.2f} ms')"
The method
ArtistInspector.get_valid_values
is still a quite expensive method that is called during startup. The full docstring is used there, so it could not easily be refactored.PR Checklist
Tests and Styling
pytest
passes).flake8-docstrings
and runflake8 --docstring-convention=all
).Documentation
doc/users/next_whats_new/
(follow instructions in README.rst there). No new feature