Skip to content

ENH: switch mpl_toolkits to implicit namespace package (PEP 420) #25381

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jun 8, 2023

Conversation

neutrinoceros
Copy link
Contributor

@neutrinoceros neutrinoceros commented Mar 4, 2023

PR Summary

closes #25244

PR Checklist

Documentation and Tests

  • Has pytest style unit tests (and pytest passes)
  • Documentation is sphinx and numpydoc compliant (the docs should build without error).
  • [N/A] New plotting related features are documented with examples.

Release Notes

  • [N/A] New features are marked with a .. versionadded:: directive in the docstring and documented in doc/users/next_whats_new/
  • [N/A] API changes are marked with a .. versionchanged:: directive in the docstring and documented in doc/api/next_api_changes/
  • Release notes conform with instructions in next_whats_new/README.rst or next_api_changes/README.rst

@neutrinoceros neutrinoceros changed the title ENH: switch mpl_toolkit to implicit namespace package (PEP 420) ENH: switch mpl_toolkits to implicit namespace package (PEP 420) Mar 4, 2023
@neutrinoceros neutrinoceros marked this pull request as ready for review March 4, 2023 14:28
setup.py Outdated
@@ -301,7 +301,6 @@ def make_release_tree(self, base_dir, files):

package_dir={"": "lib"},
packages=find_packages("lib"),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need to be changed to find_namespace_packages?

The docs for setuptools indicate that packages without __init__ will be omitted, and as mpl_toolkits no longer has a __init__ I suspect it falls into that category

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, I missed that ! I just updated the PR to include this. I don't have time right now to verify that it's safe to leave include and exclude empty, but I can spend more time on it tomorrow if needed

@ksunden ksunden added the CI: Run cibuildwheel Run wheel building tests on a PR label Mar 6, 2023
@ksunden
Copy link
Member

ksunden commented Mar 6, 2023

Building the wheels as this change affects packaging

@neutrinoceros
Copy link
Contributor Author

I ran inspect-wheel to see the before/after diff for a given wheel. Currently there are a couple errors to be fixed:

Missing

<             "matplotlib.tight_bbox",
<             "matplotlib.tight_layout",

Wrongly included

>             "matplotlib.tests.tinypages.conf",
>             "matplotlib.tests.tinypages.range4",
>             "matplotlib.tests.tinypages.range6",

@neutrinoceros neutrinoceros marked this pull request as draft March 7, 2023 17:01
@neutrinoceros
Copy link
Contributor Author

I realized that matplotlib.tight_bbox and matplotlib.tight_layout are actually not on the dev branch any more, so I will revert my (doomed) include. On the other hand I havent found a reference run of the "wheels" workflow for the 3.8 (dev) branch yet, so I'm not sure what to do about matplotlib.tests.tinypages.*. My guess is that it should still be excluded (this directory doesn't have a __init__.py

@neutrinoceros neutrinoceros marked this pull request as ready for review March 7, 2023 19:08
@tacaswell
Copy link
Member

https://github.com/matplotlib/matplotlib/actions/runs/4348842943 is an example run of the build-wheels on main (to save CI time we mostly do not build the wheels on every PR commit, but do build them on every merge to main and on the PRs we have the tag on).

👍 that the tinypages packaging should be excluded, that is something that exists purely for testing that the plot sphinx extension can read out parts of a .py file.

@neutrinoceros
Copy link
Contributor Author

thanks !
It seems that excluding matplotlib.tests.tinypages is trickier than expected because of how setuptools.find_namespace_packages work. Quoting from its docstring:

'exclude' is a sequence of names to exclude; '*' can be used
as a wildcard in the names.
When finding packages, 'foo.*' will exclude all subpackages of 'foo'
(but not 'foo' itself).

effectively this means that, in order to exclude matplotlib.tests.tinypages, would need to specify exclude=["matplotlib.tests.*"] and then manually re-include all subpackages that are supposed to be included. I don't mind doing that, but is this worth the maintenance cost ?

@tacaswell
Copy link
Member

I think we only hove one namespace package (mpl_toolkits) and I am very 👎🏻 on adding any more. Maybe we should just hard-code the one we have and leave it at that?

@neutrinoceros
Copy link
Contributor Author

That would be my preference too. Unfortunately using

    packages=find_packages("lib"),
    namespace_packages=["mpl_toolkits"],

just like on the main branch, doesn't work:

$ python -m build
* Creating venv isolated environment...
* Installing packages in isolated environment... (certifi>=2020.06.20, oldest-supported-numpy, pybind11>=2.6, setuptools_scm>=7)
* Getting build dependencies for sdist...

Edit mplsetup.cfg to change the build options; suppress output with --quiet.

BUILDING MATPLOTLIB
      python: yes [3.11.2 (main, Feb 19 2023, 11:54:34) [Clang 14.0.0
                  (clang-1400.0.29.202)]]
    platform: yes [darwin]
       tests: no  [skipping due to configuration]
      macosx: yes [installing]

error in matplotlib setup command: Distribution contains no modules or packages for namespace package 'mpl_toolkits'

ERROR Backend subprocess exited when trying to invoke get_requires_for_build_sdist

I'm tended to think that's a shortcoming of setuptools. If anyone agrees with me I'll report upstream.

@ksunden
Copy link
Member

ksunden commented Mar 8, 2023

Any reason we can't just do find_packages("lib") + ["mpl_toolkits"]

That adds the one implicit namespace package we want while ignoring the edgecase behavior of excludes.

I mean, find_packages is just a list of strings convenience method anyway... if the tuning options are not what we want, we can override them pretty easily.

I may suggest a comment to indicate that it is namespace package related/that get_namespace_packages doesn't do what we want. Such that future us can at least be reminded of it should we ever decide to add a namespace package again (somewhat doubt, I'd tend to reach for entry points for a branch of similar ideas these days, but could happen)

@tacaswell
Copy link
Member

There is now discussion going on upstream with projects (e.g. zope and plone) who use this orders of magnitude more than we so it might be worth seeing how that shakes out.

@neutrinoceros
Copy link
Contributor Author

@tacaswell can you link those discussions here ?

@tacaswell
Copy link
Member

pypa/setuptools#3434

@ksunden
Copy link
Member

ksunden commented May 25, 2023

Okay, I've been digging into this (with the help of @henryiii)

There are some... odd things happening through combinations of tools we use.

It's not actually a problem if one source is missing the __init__.py, the problems come in when multiple sources have one, but they are not byte-per-byte identical. So I don't think we have to worry too much about removing the file causing compatibility problems.

Even on current main/releases, the mpl_tookits/__init__.py file does not appear in the wheels (it does in sdist, still)

Except on windows, where it is not the same version of the file, it is instead added by delvewheel:

Delvewheel init.py contents
""""""# start delvewheel patch
def _delvewheel_init_patch_1_3_3():
    import ctypes
    import os
    import platform
    import sys
    libs_dir = os.path.abspath(os.path.join(os.path.dirname(__file__), os.pardir, 'matplotlib.libs'))
    is_pyinstaller = getattr(sys, 'frozen', False) and hasattr(sys, '_MEIPASS')
    is_conda_cpython = platform.python_implementation() == 'CPython' and (hasattr(ctypes.pythonapi, 'Anaconda_GetVersion') or 'packaged by conda-forge' in sys.version)
    if sys.version_info[:2] >= (3, 8) and not is_conda_cpython or sys.version_info[:2] >= (3, 10):
        if not is_pyinstaller or os.path.isdir(libs_dir):
            os.add_dll_directory(libs_dir)
    else:
        load_order_filepath = os.path.join(libs_dir, '.load-order-matplotlib-3.7.1')
        if not is_pyinstaller or os.path.isfile(load_order_filepath):
            with open(os.path.join(libs_dir, '.load-order-matplotlib-3.7.1')) as file:
                load_order = file.read().split()
            for lib in load_order:
                lib_path = os.path.join(os.path.join(libs_dir, lib))
                if (not is_pyinstaller or os.path.isfile(lib_path)) and not ctypes.windll.kernel32.LoadLibraryExW(ctypes.c_wchar_p(lib_path), None, 0x00000008):
                    raise OSError('Error loading {}; {}'.format(lib, ctypes.FormatError()))


_delvewheel_init_patch_1_3_3()
del _delvewheel_init_patch_1_3_3
# end delvewheel patch

By delvewheel's own docs, this behavior is problematic for namespace packages:

delvewheel creates or patches __init__.py in each top-level package so that the DLLs are loaded properly during import. This will cause issues if you have a top-level namespace package that requires __init__.py to be absent to function properly.

As, at least I'm pretty sure, we do not actually import any dlls directly within mpl-toolkits, perhaps we can configure delvewheel to ignore it? or lobby for better handling of namespace packages by upstream?

So, deleting the __init__.py and adjusting the discovered packages should be at least as good as status quo, but potential problems (that already exist) may persist for windows, especially.

The exclude is shell glob syntax, so it is actually pretty easy to specify precisely what we want:

diff --git a/setup.py b/setup.py
index 6d96e6b86c..4f43b2c425 100644
--- a/setup.py
+++ b/setup.py
@@ -29,7 +29,7 @@ from pathlib import Path
 import shutil
 import subprocess
 
-from setuptools import setup, find_packages, Distribution, Extension
+from setuptools import setup, find_namespace_packages, Distribution, Extension
 import setuptools.command.build_ext
 import setuptools.command.build_py
 import setuptools.command.sdist
@@ -301,7 +301,7 @@ setup(  # Finally, pass this all along to setuptools to do the heavy lifting.
     ],
 
     package_dir={"": "lib"},
-    packages=find_packages("lib"),
+    packages=find_namespace_packages("lib", exclude=["*baseline_images*", "*tinypages*", "*mpl-data*", "*web_backend*"]),
     namespace_packages=["mpl_toolkits"],
     py_modules=["pylab"],
     # Dummy extension to trigger build_ext, which will swap it out with

This actually does not change the wheels that are produced at all (but removes the __init__.py from the sdist)

removing the namespace_packages=["mpl_toolkits"], from the following line will remove the .pth file and namespace_packages.txt from the wheels, which should still work, though may have some particularly prickly edge cases (that I haven't tested) surrounding things like installing in multiple locations (e.g. one installed with --user)

Ultimately, I don't even think there are that many downstreams using this feature, and at least one of them (basemap) is something we can control. The only other one I could find (at least via github search) is pygae/mpl_toolkits.clifford#10, which has not been touched in several years, and has not replied to @oscargus's issue opened in February.

I greatly expect that if something was going to fail for users, it would have already happened, given the current state of the wheels. My leaning is to push forward down this path.

@neutrinoceros
Copy link
Contributor Author

Thank you so much @ksunden ! I've updated the PR with your suggestion.

@@ -301,8 +301,10 @@ def make_release_tree(self, base_dir, files):
],

package_dir={"": "lib"},
packages=find_packages("lib"),
namespace_packages=["mpl_toolkits"],
packages=find_namespace_packages(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I read https://setuptools.pypa.io/en/latest/userguide/package_discovery.html#finding-simple-packages correctly, but could we use find_packages(…, include=..)? The exclude has the danger of unintentionally picking up other files we add in the future.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I experimented a bit with it and so far I haven't been able to build matplotlib by supplying include instead of exclude. Even building an exhaustive list of all modules that should be included doesn't appear to work. I don't know if that's a bug in setuptools or if there's something I don't understand about this function.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the key difference between include and exclude are a) liklihood of adding things that we want to go in each list. and b) ease of drawing the lines around the things we want included/excluded.

for a), I think it is more likely that we add python modules than folders we wish to exclude from being importable python modules (both could happen, but I'd expect python modules to be more common), so excludes are less likely to be updated

for b) I think that the things excluded appear directly next to things that want to be included, so I think it is much easier to specify exclusions.

I mean, it is just a list of strings at the end of the day, so if there is an easier way to produce the list we want, I'm not opposed, but lean towards using the tools provided to produce the lists.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will change under meson so lets not over think this.

Copy link
Member

@tacaswell tacaswell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not make the windows situation worse and brings us inline with upstream.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CI: Run cibuildwheel Run wheel building tests on a PR topic: mpl_toolkit
Projects
Development

Successfully merging this pull request may close these issues.

[Bug]: DeprecationWarning for pkg_resources.declare_namespace usage in mpl_toolkit
7 participants