Skip to content

Lazy import of private modules #12781

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Feb 27, 2019
Merged

Conversation

mdboom
Copy link
Member

@mdboom mdboom commented Nov 8, 2018

PR Summary

Not all platforms have blazing fast dynamic linkers, therefore a lot of import (startup) time can be saved by not importing extension modules that we don't have to.

There are some more aggressive uses of lazy loading that are possible, but this is just the stuff that's already declared private and thus users who are importing these things at this path have had "fair warning". If acceptable, I can provide a follow-on PR that deprecates and then removes additional things.

PR Checklist

  • Has Pytest style unit tests
  • Code is Flake 8 compliant
  • New features are documented, with examples if plot related
  • Documentation is sphinx and numpydoc compliant
  • Added an entry to doc/users/next_whats_new/ if major new feature (follow instructions in README.rst there)
  • Documented in doc/api/api_changes.rst if API changed in a backward-incompatible way

Copy link
Member

@tacaswell tacaswell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't normally like local imports, but this seems like a compelling argument and is much simpler than using a lazy proxy.

@anntzer
Copy link
Contributor

anntzer commented Nov 8, 2018

Just importing matplotlib.figure (which is necessary if you want to do anything with matplotlib, I'm not even talking about pyplot) already imports these extension modules, so it's not clear what actual use cases would get accelerated?

$ MPLCONFIGDIR=/tmp/mpld MPLBACKEND= python  # make sure we have a default config
Python 3.7.1 (default, Oct 22 2018, 10:41:28) 
[GCC 8.2.1 20180831] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import matplotlib.figure, sys
>>> [m for m in sys.modules if m.startswith("matplotlib._")]
['matplotlib._color_data', 'matplotlib._version', 'matplotlib._pylab_helpers', 'matplotlib._mathtext_data', 'matplotlib._png', 'matplotlib._path', 'matplotlib._cm', 'matplotlib._cm_listed', 'matplotlib._image', 'matplotlib._contour', 'matplotlib._layoutbox', 'matplotlib._constrained_layout', 'matplotlib._tri', 'matplotlib._qhull']

@ImportanceOfBeingErnest
Copy link
Member

Out of interest, do you have a number for how much time this will save?

@QuLogic
Copy link
Member

QuLogic commented Nov 8, 2018

It's not really easy to read the whole thing, but the flamegraph in #11546 doesn't seem to indicate that these imports are particularly significant, except maybe the urllib one you have downstream. Importing NumPy seems to dominate by a fair fair bit, unless you've already tackled that one?

anntzer
anntzer previously requested changes Nov 9, 2018
Copy link
Contributor

@anntzer anntzer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per #12781 (comment) I think if we're going that route there should be at least a test showing that importing matplotlib.figure and matplotlib.backend_bases (in a separate subprocess) does not lead to any of these extension modules being loaded.
Otherwise it's too easy to break that property...

@mdboom
Copy link
Member Author

mdboom commented Nov 9, 2018

@anntzer wrote:

Just importing matplotlib.figure (which is necessary if you want to do anything with matplotlib, I'm not even talking about pyplot) already imports these extension modules,

Have you tried it with this PR though?

@ImportanceOfBeingErnest wrote:

Out of interest, do you have a number for how much time this will save?

When importing matplotlib in pyodide, the import time goes from 5.9s to 5.2s (when added with a few others that involve non-underscore-prefixed-but-probably-not-intentionally-public things). This is largely due to the slowness of dynamically linking extension modules on the WebAssembly platform. I would expect the delta on a native platform to be much smaller. (See pyodide/pyodide#251).

@QuLogic wrote:

It's not really easy to read the whole thing, but the flamegraph in #11546 doesn't seem to indicate that these imports are particularly significant

Things change when the cost of importing extension modules is higher. Particularly the _qhull and _tri modules stand out more in that context.

image

Raw data: https://gist.github.com/mdboom/3d5d86e217f8f650f33a85a39cd835cc

But, yes, I would be a big win to not import urllib as well.

I haven't tackled numpy, but I would consider that something that probably needs to be dealt with in numpy. Lazy importing all of numpy for matplotlib seems unfeasible.

@mdboom
Copy link
Member Author

mdboom commented Nov 9, 2018

@anntzer: I've added a test here to ensure this "sticks". It can easily be extended in the future if we want to "blacklist" more modules for lazy loading.

This also gave me the opportunity to measure on a native build. As expected, it's pretty minor: 251424 vs 281676 us.

@mdboom mdboom force-pushed the reduce-imports branch 2 times, most recently from cae5a92 to b02a8ec Compare November 9, 2018 03:04
@anntzer
Copy link
Contributor

anntzer commented Nov 9, 2018

Have you tried it with this PR though?

Duh, sorry :)

Thanks for the test.

@mdboom
Copy link
Member Author

mdboom commented Nov 9, 2018

Would also lazy loading urllib also be acceptable? That's the next piece of low-hanging fruit. It's not underscore-prefixed, but hopefully no one is doing from matplotlib.cbook import urllib. (Famous last words, I know).

@anntzer
Copy link
Contributor

anntzer commented Nov 12, 2018

I did try inlining urllib imports in #11546 (comment) and actually saw no improvements (despite what the flamegraph says).
If you have numbers that do show improvements with urllib in your case, making the import lazy sounds fine to me.

@mdboom
Copy link
Member Author

mdboom commented Nov 26, 2018

This is passing CI and ready to merge.

@timhoffm
Copy link
Member

timhoffm commented Dec 1, 2018

I can confirm that urllib.request not imported anymore with a standard import matplotlib in this PR.

@tacaswell
Copy link
Member

@mdboom Took the liberty of rebasing and force-pushing, hope you do not mind.

@jklymak
Copy link
Member

jklymak commented Feb 26, 2019

This has some real test errors. Presumably missing imports?

@tacaswell
Copy link
Member

Looks like I missed one on the rebase, fixed now.

@jklymak jklymak merged commit 67dd5d3 into matplotlib:master Feb 27, 2019
meeseeksmachine pushed a commit to meeseeksmachine/matplotlib that referenced this pull request Feb 27, 2019
jklymak added a commit that referenced this pull request Feb 27, 2019
…781-on-v3.1.x

Backport PR #12781 on branch v3.1.x (Lazy import of private modules)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants