Exclude unused libpython{python_version}.so to reduce the size of zipped Python executables #772

Merged: 1 commit, Aug 10, 2022

tetsuok
Contributor

@tetsuok tetsuok commented Jul 29, 2022

PR Checklist

Please check if your PR fulfills the following requirements:

  • Tests for the changes have been added (for bug fixes / features)
  • Docs have been added / updated (for bug fixes / features)

PR Type

What kind of change does this PR introduce?

  • Bugfix
  • Feature (please, look at the "Scope of the project" section in the README.md file)
  • Code style update (formatting, local variables)
  • Refactoring (no functional changes, no api changes)
  • Build related changes
  • CI related changes
  • Documentation content changes
  • Other... Please describe:

What is the current behavior?

Issue Number: N/A

#758 reduced the size of artifacts built by the bazel build command with the --build_python_zip option, but there is still room for improvement. Even a zipped Python binary that just prints "hello, world" is at least 43 MB, which is still large when build artifacts are included in Docker images. It turns out that the --build_python_zip option puts two identical shared libraries (real copies, not symlinks), libpython{python_version}.so and libpython{python_version}.so.1.0, into the zip file, but rules_python doesn't use libpython{python_version}.so.
libpython{python_version}.so accounts for 16 MB (37%) of the 43 MB artifact. Removing the unused shared library lets users of rules_python deploy smaller Python binaries.

Repro steps

WORKSPACE:

load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

commit = "9cdb4f3f6aded1ccc62a10d004f9927ccc72702f"

http_archive(
    name = "rules_python",
    sha256 = "7d048c530ca907013565fee907d358314c4cb329622284e07c50be76f2d961ee",
    strip_prefix = "rules_python-" + commit,
    url = "https://github.com/tetsuok/rules_python/archive/" + commit + ".tar.gz",
)

load("@rules_python//python:repositories.bzl", "python_register_toolchains")

python_register_toolchains(
    name = "python3_10",
    python_version = "3.10.4",
)

BUILD:

load("@rules_python//python:defs.bzl", "py_binary")

py_binary(
    name = "hello",
    srcs = ["hello.py"],
)

hello.py:

print("hello, world!")

Build the zip and check its size:

$ bazel build --build_python_zip //:hello
$ ls -lh bazel-bin/hello.zip
-r-xr-xr-x 1 t docker 43M Jul 29 00:15 bazel-bin/hello.zip

The zip file contains two identical shared libraries, libpython{python_version}.so and libpython{python_version}.so.1.0, and together they dominate its total size:

$ unzip -l bazel-bin/hello.zip | sort -n | tail -n 6
   745004  2010-01-01 00:00   runfiles/python3_10_x86_64-unknown-linux-gnu/lib/python3.10/pydoc_data/topics.py
   816725  2010-01-01 00:00   runfiles/python3_10_x86_64-unknown-linux-gnu/lib/python3.10/ensurepip/_bundled/setuptools-58.1.0-py3-none-any.whl
  2123599  2010-01-01 00:00   runfiles/python3_10_x86_64-unknown-linux-gnu/lib/python3.10/ensurepip/_bundled/pip-22.0.4-py3-none-any.whl
 38861696  2010-01-01 00:00   runfiles/python3_10_x86_64-unknown-linux-gnu/lib/libpython3.10.so
 38861696  2010-01-01 00:00   runfiles/python3_10_x86_64-unknown-linux-gnu/lib/libpython3.10.so.1.0
110513272                     2381 files
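
The listing above shows that the two libraries have the same size, which alone does not prove they are identical. As a minimal sketch (not part of this PR), the following Python helper groups zip entries by size and SHA-256 digest to confirm byte-for-byte duplicates. The function name find_duplicate_entries is hypothetical; only the standard zipfile and hashlib modules are used.

```python
import hashlib
import zipfile
from collections import defaultdict


def find_duplicate_entries(zip_path):
    """Return [(size, [names...]), ...] for entries with identical contents.

    zip_path may be a filesystem path or a seekable file-like object.
    Hypothetical helper for illustration; not part of rules_python.
    """
    by_digest = defaultdict(list)
    with zipfile.ZipFile(zip_path) as zf:
        for info in zf.infolist():
            # Skip directories and empty files, which are trivially "equal".
            if info.is_dir() or info.file_size == 0:
                continue
            digest = hashlib.sha256()
            with zf.open(info) as f:
                # Read in 1 MiB chunks so large libraries are not loaded at once.
                for chunk in iter(lambda: f.read(1 << 20), b""):
                    digest.update(chunk)
            by_digest[(info.file_size, digest.hexdigest())].append(info.filename)
    groups = [
        (size, sorted(names))
        for (size, _), names in by_digest.items()
        if len(names) > 1
    ]
    # Largest duplicates first, since they matter most for artifact size.
    return sorted(groups, reverse=True)
```

Calling find_duplicate_entries("bazel-bin/hello.zip") on the artifact from the repro steps should report the two libpython entries as one group if they are indeed identical, possibly alongside other duplicated files.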

What is the new behavior?

The unused shared library libpython{python_version}.so, which ships with the hermetic Python toolchain, is now excluded from the Python runfiles built by the Python rules. Artifacts built by the bazel build command with the --build_python_zip option get smaller: the zipped Python executable shrinks by 16 MB (43 MB → 27 MB in the above example).
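
One way to check the new behavior on a built artifact is to look for the unversioned library name among the zip entries. The sketch below is an assumption about how such a check could be written, not code from this PR; the helper name unversioned_libpython and the regular expression are hypothetical.

```python
import re

# Matches ".../lib/libpython3.10.so" but not ".../lib/libpython3.10.so.1.0".
_UNVERSIONED_LIBPYTHON = re.compile(r"(^|/)lib/libpython\d+\.\d+\.so$")


def unversioned_libpython(names):
    """Return the entry names that look like the unused libpython{python_version}.so."""
    return [name for name in names if _UNVERSIONED_LIBPYTHON.search(name)]
```

For example, passing zipfile.ZipFile("bazel-bin/hello.zip").namelist() to unversioned_libpython should return an empty list after this change, while libpython{python_version}.so.1.0 remains in the archive.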

Does this PR introduce a breaking change?

  • Yes
  • No

Other information

@tetsuok tetsuok requested review from brandjon and lberki as code owners July 29, 2022 01:16
@mattem mattem force-pushed the exclude-unused-libpython-so branch from 0a48a4f to cba3678 August 10, 2022 14:46
@mattem mattem merged commit e99bd61 into bazel-contrib:main Aug 10, 2022
@tetsuok tetsuok deleted the exclude-unused-libpython-so branch August 10, 2022 14:57