Skip to content

Add standard Python source distribution generation to (pre-)release workflow #152098

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 17 commits into from

Conversation

zklaus
Copy link
Collaborator

@zklaus zklaus commented Apr 24, 2025

Generate source tarball with PEP 517 conform build tools instead of the custom routine in place right now.

Closes #150461.

The current procedure for generating the source tarball consists in creation of a source tree by manual copying and pruning of source files.

This PR replaces that with a call to the standard build tool, which works with the build backend to produce an sdist. For that to work correctly, the build backend also needs to be configured. In the case of Pytorch, the backend currently is (the legacy version of) the setuptools backend, the source dist part of which is mostly configured via the MANIFEST.in file.

The resulting source distribution can be used to install directly from source with pip install ./torch-{version}.tar.gz or to build wheels directly from source with pip wheel ./torch-{version}.tar.gz; both should be considered experimental for now.

Issues

sdist name

According to PEP 517, the name of the source distribution file must coincide with the project name, or more precisely, the source distribution of a project that generates {NAME}-{...}.whl wheels are required to be named {NAME}-{...}.tar.gz. Currently, the source tarball is called pytorch-{...}.tar.gz, but the generated wheels and python package are called torch-{...}.

Symbolic Links

The source tree at the moment contains a small number of symbolic links. This has been seen as problematic largely because of lack of support on Windows, but also because of a problem in setuptools. Particularly unfortunate is a circular symlink in the third party ittapi module, which can not be resolved by replacing it with a copy.

PEP 721 (now integrated in the Source Distribution Format Specification) allows for symbolic links, but only if they don't point outside the destination directory and if they don't contain ../ in their target.

The list of symbolic links currently is as follows:

source target problem solution
.dockerignore .gitignore ✅ ok (individual file)
docs/requirements.txt ../.ci/docker/requirements-docs.txt .. in target swap source and target1
functorch/docs/source/notebooks ../../notebooks/ .. in target swap source and target1
.github/ci_commit_pins/triton.txt ../../.ci/docker/ci_commit_pins/triton.txt ✅ ok (omitted from sdist)
third_party/flatbuffers/docs/source/CONTRIBUTING.md ../../CONTRIBUTING.md .. in target omit from sdist2
third_party/flatbuffers/java/src/test/java/DictionaryLookup ../../../../tests/DictionaryLookup .. in target omit from sdist3
third_party/flatbuffers/java/src/test/java/MyGame ../../../../tests/MyGame .. in target omit from sdist3
third_party/flatbuffers/java/src/test/java/NamespaceA ../../../../tests/namespace_test/NamespaceA .. in target omit from sdist3
third_party/flatbuffers/java/src/test/java/NamespaceC ../../../../tests/namespace_test/NamespaceC .. in target omit from sdist3
third_party/flatbuffers/java/src/test/java/optional_scalars ../../../../tests/optional_scalars .. in target omit from sdist3
third_party/flatbuffers/java/src/test/java/union_vector ../../../../tests/union_vector .. in target omit from sdist3
third_party/flatbuffers/kotlin/benchmark/src/jvmMain/java ../../../../java/src/main/java .. in target omit from sdist3
third_party/ittapi/rust/ittapi-sys/c-library ../../ .. in target omit from sdist4
third_party/ittapi/rust/ittapi-sys/LICENSES ../../LICENSES .. in target omit from sdist4
third_party/opentelemetry-cpp/buildscripts/pre-merge-commit ./pre-commit ✅ ok (individual file)
third_party/opentelemetry-cpp/third_party/prometheus-cpp/cmake/project-import-cmake/sample_client.cc ../../push/tests/integration/sample_client.cc .. in target omit from sdist5
third_party/opentelemetry-cpp/third_party/prometheus-cpp/cmake/project-import-cmake/sample_server.cc ../../pull/tests/integration/sample_server.cc .. in target omit from sdist5
third_party/opentelemetry-cpp/third_party/prometheus-cpp/cmake/project-import-pkgconfig/sample_client.cc ../../push/tests/integration/sample_client.cc .. in target omit from sdist5
third_party/opentelemetry-cpp/third_party/prometheus-cpp/cmake/project-import-pkgconfig/sample_server.cc ../../pull/tests/integration/sample_server.cc .. in target omit from sdist5
third_party/XNNPACK/tools/xngen xngen.py ✅ ok (individual file)

The introduction of symbolic links inside the .ci/docker folder creates a new problem, however, because Docker's COPY command does not allow symlinks in this way. We work around that by using tar ch to dereference the symlinks before handing them over to docker build.

Nccl

Nccl used to be included as a submodule. However, with #146073 (first released in v2.7.0-rc1), the submodule was removed and replaced with a build time checkout procedure in tools/build_pytorch_libs.py, which checks out the required version of nccl from the upstream repository based on a commit pin recorded in .ci/docker/ci_commit_pins/nccl-cu{11,12}.txt.
This means that a crucial third party dependency is missing from the source distribution and as the .ci folder is omitted from the source distribution, it is not possible to use the build time download.
However, it is possible to use a system provided Nccl using the USE_SYSTEM_NCCL environment variable, which now also is the default for the official Pytorch wheels.

Stack from ghstack (oldest at bottom):

Footnotes

  1. These resources can be naturally considered to be part of the docs, so moving the actual files into the place of the current symlinks and replacing them with (unproblematic) symlinks can be said to improve semantics as well. 2

  2. The flatbuffers docs already actually use the original file, not the symlink and in the most recent releases, starting from flatbuffers-25.1.21 the symlink is replaced by the actual file thanks to a documentation overhaul.

  3. These resources are flatbuffers tests for java and kotlin and can be omitted from our sdist. 2 3 4 5 6 7

  4. We don't need to ship the rust bindings for ittapi. 2

  5. These are demonstration examples for how to link to prometheus-cpp using cmake and can be omitted. 2 3 4

[ghstack-poisoned]
Copy link

pytorch-bot bot commented Apr 24, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/152098

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit e51f387 with merge base 5dfd8a9 (image):
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot pytorch-bot bot added the release notes: releng release notes category label Apr 24, 2025
@zklaus zklaus marked this pull request as draft April 24, 2025 10:29
[ghstack-poisoned]
zklaus added a commit that referenced this pull request Apr 24, 2025
ghstack-source-id: 7b67212
Pull Request resolved: #152098
@atalman
Copy link
Contributor

atalman commented Apr 28, 2025

Hi @zklaus could you please add pep517 build as additional source code packaging ? Keeping current tar.gz file as is for now.

[ghstack-poisoned]
zklaus added a commit that referenced this pull request Apr 30, 2025
ghstack-source-id: b33c3b6
Pull Request resolved: #152098
[ghstack-poisoned]
zklaus added a commit that referenced this pull request May 1, 2025
ghstack-source-id: a299d4a
Pull Request resolved: #152098
[ghstack-poisoned]
[ghstack-poisoned]
zklaus added a commit that referenced this pull request May 15, 2025
ghstack-source-id: 913b6c3
Pull Request resolved: #152098
[ghstack-poisoned]
zklaus added a commit that referenced this pull request May 16, 2025
ghstack-source-id: c93207d
Pull Request resolved: #152098
[ghstack-poisoned]
zklaus added a commit that referenced this pull request May 19, 2025
ghstack-source-id: 4f4394d
Pull Request resolved: #152098
[ghstack-poisoned]
zklaus added a commit that referenced this pull request Jun 6, 2025
ghstack-source-id: 6404328
Pull Request resolved: #152098
[ghstack-poisoned]
zklaus added a commit that referenced this pull request Jun 9, 2025
ghstack-source-id: 30a49d6
Pull Request resolved: #152098
[ghstack-poisoned]
zklaus added a commit that referenced this pull request Jun 11, 2025
ghstack-source-id: 397662c
Pull Request resolved: #152098
zklaus added a commit that referenced this pull request Jun 30, 2025
ghstack-source-id: d4adacc
Pull Request resolved: #152098
@zklaus
Copy link
Collaborator Author

zklaus commented Jun 30, 2025

@pytorchbot merge

@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

@malfet
Copy link
Contributor

malfet commented Jul 1, 2025

@pytorchbot revert -m "IMO this PR needs to be split into few helper ones, with better test plan" -c weird

@pytorchmergebot
Copy link
Collaborator

@pytorchbot successfully started a revert job. Check the current status here.
Questions? Feedback? Please reach out to the PyTorch DevX Team

pytorchmergebot added a commit that referenced this pull request Jul 1, 2025
This reverts commit f16053f.

Reverted #152098 on behalf of https://github.com/malfet due to IMO this PR needs to be split into few helper ones, with better test plan ([comment](#152098 (comment)))
@pytorchmergebot
Copy link
Collaborator

@zklaus your PR has been successfully reverted.

@pytorchmergebot pytorchmergebot added Reverted ci-no-td Do not run TD on this PR labels Jul 1, 2025
@malfet
Copy link
Contributor

malfet commented Jul 1, 2025

@zklaus what's your test plan for this change?
For example replacing docs/requirements.txt symlink with the file, mean that we know have two sets of requirements: one for CI and another for users who try to build them elsewhere with no script to keep them in sync.

Also, if symlinks are no longer allowed in PyTorch source repo, that requires a wider discussion
(And if I'm not mistaken, ittapi has been udpated a while ago and new version no longer has this problem. If it's still does, than perhaps it's time to remove it.

@malfet malfet dismissed atalman’s stale review July 1, 2025 14:17

Let's have a further discussion on this one

@zklaus
Copy link
Collaborator Author

zklaus commented Jul 2, 2025

For example replacing docs/requirements.txt symlink with the file, mean that we know have two sets of requirements: one for CI and another for users who try to build them elsewhere with no script to keep them in sync.

How do you mean?

Before this change, there was the requirements file .ci/docker/requirements-docs.txt which was symlinked as ../.ci/docker/requirements-docs.txt from docs/requirements.txt since #151796.
In this situation, because .ci is excluded from the source tarball, we end up with a broken symlink, that additionally is invalid in a Python source distribution.

After this change, there is still a single source of truth, which now is docs/requirements.txt, symlinked as ../docs/requirements.txt from .ci/docker/requirements-docs.txt, which would also be invalid in a Python source distribution, but is not included in the tarball (see above). Additionally, the docs requirements that were missing from the previous tarball, are now actually included, allowing users to build the documentation again.

Also, if symlinks are no longer allowed in PyTorch source repo, that requires a wider discussion (And if I'm not mistaken, ittapi has been udpated a while ago and new version no longer has this problem. If it's still does, than perhaps it's time to remove it.

This PR is not intended to forbid or remove all symlinks. It does change some symlinks, which are invalid in Python source distributions and tries to point out in sufficient detail problem and remedy in the PR description. Apologies if the description is unclear.

As for ittapi, it seems that the problem remains, but as it exists only the Rust subfolder of ittapi's source tree, the suggested remedy (as detailed in the PR description), is to not ship the Rust bindings sources for ittapi.

@zklaus what's your test plan for this change?

First, let me point out that this adds a new source artifact without changing the current practice, so that all existing workflows should remain functional. I'm sorry this is was not well represented in the title of the PR (now changed). I have tried to keep the changes beyond this addition minimal and to leave both the Pytorch sources and the (pre-)release tarballs functionally unchanged.
Of course it's possible that I made a mistake, so if you see a change that impacts either the source tree or the "old-style" tarballs, please do let me know!

As for testing, I have compared the file listings of the old and new tarballs and I have verified that Pytorch can be built from the source distribution, both with and without unpacking it. Of course there are many ways to configure a Pytorch build and not every combination can be tested manually.

Seeing as this PR improves the source quality by avoiding one broken and several invalid links in the tarball without otherwise impacting existing workflows, I think it would be good to integrate.

Short-term, I have planned to also add an actual build step to the tarball creation.
Mid-term that may be fleshed out into a matrix of "most important builds".
Long-term, I think it's worth discussing to base all ci builds on a source distribution, to avoid costly git operations and improve reproducability.

@malfet, does this address your concerns?

@zklaus zklaus changed the title Switch to standard pep517 sdist generation Add standard Python source distribution generation to (pre-)release workflow Jul 2, 2025
@zklaus zklaus requested a review from malfet July 2, 2025 15:56
[ghstack-poisoned]
zklaus added a commit that referenced this pull request Jul 2, 2025
ghstack-source-id: c920ff1
Pull Request resolved: #152098
[ghstack-poisoned]
[ghstack-poisoned]
zklaus added a commit that referenced this pull request Jul 3, 2025
ghstack-source-id: 64cedb8
Pull Request resolved: #152098
@zklaus
Copy link
Collaborator Author

zklaus commented Jul 8, 2025

I have split this up into a number of smaller PRs in the stack #157811 - #157815.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ci-no-td Do not run TD on this PR ciflow/trunk Trigger trunk jobs on your pull request Merged open source release notes: releng release notes category Reverted
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants