
Commit f16053f

zklaus authored and pytorchmergebot committed
Switch to standard pep517 sdist generation (#152098)
Generate the source tarball with PEP 517-conformant build tools instead of the custom routine currently in place. Closes #150461.

The current procedure for generating the source tarball consists of creating a source tree by manually copying and pruning source files.

This PR replaces that with a call to the standard [build tool](https://build.pypa.io/en/stable/), which works with the build backend to produce an sdist. For that to work correctly, the build backend also needs to be configured. In the case of PyTorch, the backend currently is (the legacy version of) the setuptools backend, the source dist part of which is mostly configured via the `MANIFEST.in` file.

The resulting source distribution can be used to install directly from source with `pip install ./torch-{version}.tar.gz` or to build wheels directly from source with `pip wheel ./torch-{version}.tar.gz`; both should be considered experimental for now.

## Issues

### sdist name

According to PEP 517, the name of the source distribution file must coincide with the project name, or [more precisely](https://peps.python.org/pep-0517/#source-distributions), the source distribution of a project that generates `{NAME}-{...}.whl` wheels is required to be named `{NAME}-{...}.tar.gz`. Currently, the source tarball is called `pytorch-{...}.tar.gz`, but the generated wheels and Python package are called `torch-{...}`.

### Symbolic Links

The source tree at the moment contains a small number of symbolic links. This has been seen as problematic (pypa/pip#5919), largely because of lack of support on Windows, but also because of a problem in setuptools (pypa/setuptools#4937). Particularly unfortunate is a circular symlink in the third-party `ittapi` module, which cannot be resolved by replacing it with a copy.

PEP 721 (now integrated in the [Source Distribution Format Specification](https://packaging.python.org/en/latest/specifications/source-distribution-format/#source-distribution-archive-features)) allows for symbolic links, but only if they don't point outside the destination directory and if they don't contain `../` in their target.

The list of symbolic links currently is as follows:

<details>

| source | target | problem | solution |
|-|-|-|-|
| `.dockerignore` | `.gitignore` | ✅ ok (individual file) | |
| `docs/requirements.txt` | `../.ci/docker/requirements-docs.txt` | ❗`..` in target | swap source and target[^1] |
| `functorch/docs/source/notebooks` | `../../notebooks/` | ❗`..` in target | swap source and target[^1] |
| `.github/ci_commit_pins/triton.txt` | `../../.ci/docker/ci_commit_pins/triton.txt` | ✅ ok (omitted from sdist) | |
| `third_party/flatbuffers/docs/source/CONTRIBUTING.md` | `../../CONTRIBUTING.md` | ❗`..` in target | omit from sdist[^2] |
| `third_party/flatbuffers/java/src/test/java/DictionaryLookup` | `../../../../tests/DictionaryLookup` | ❗`..` in target | omit from sdist[^3] |
| `third_party/flatbuffers/java/src/test/java/MyGame` | `../../../../tests/MyGame` | ❗`..` in target | omit from sdist[^3] |
| `third_party/flatbuffers/java/src/test/java/NamespaceA` | `../../../../tests/namespace_test/NamespaceA` | ❗`..` in target | omit from sdist[^3] |
| `third_party/flatbuffers/java/src/test/java/NamespaceC` | `../../../../tests/namespace_test/NamespaceC` | ❗`..` in target | omit from sdist[^3] |
| `third_party/flatbuffers/java/src/test/java/optional_scalars` | `../../../../tests/optional_scalars` | ❗`..` in target | omit from sdist[^3] |
| `third_party/flatbuffers/java/src/test/java/union_vector` | `../../../../tests/union_vector` | ❗`..` in target | omit from sdist[^3] |
| `third_party/flatbuffers/kotlin/benchmark/src/jvmMain/java` | `../../../../java/src/main/java` | ❗`..` in target | omit from sdist[^3] |
| `third_party/ittapi/rust/ittapi-sys/c-library` | `../../` | ❗`..` in target | omit from sdist[^4] |
| `third_party/ittapi/rust/ittapi-sys/LICENSES` | `../../LICENSES` | ❗`..` in target | omit from sdist[^4] |
| `third_party/opentelemetry-cpp/buildscripts/pre-merge-commit` | `./pre-commit` | ✅ ok (individual file) | |
| `third_party/opentelemetry-cpp/third_party/prometheus-cpp/cmake/project-import-cmake/sample_client.cc` | `../../push/tests/integration/sample_client.cc` | ❗`..` in target | omit from sdist[^5] |
| `third_party/opentelemetry-cpp/third_party/prometheus-cpp/cmake/project-import-cmake/sample_server.cc` | `../../pull/tests/integration/sample_server.cc` | ❗`..` in target | omit from sdist[^5] |
| `third_party/opentelemetry-cpp/third_party/prometheus-cpp/cmake/project-import-pkgconfig/sample_client.cc` | `../../push/tests/integration/sample_client.cc` | ❗`..` in target | omit from sdist[^5] |
| `third_party/opentelemetry-cpp/third_party/prometheus-cpp/cmake/project-import-pkgconfig/sample_server.cc` | `../../pull/tests/integration/sample_server.cc` | ❗`..` in target | omit from sdist[^5] |
| `third_party/XNNPACK/tools/xngen` | `xngen.py` | ✅ ok (individual file) | |

</details>

The introduction of symbolic links inside the `.ci/docker` folder creates a new problem, however, because Docker's `COPY` command does not follow symlinks that point outside the build context. We work around that by using `tar ch` to dereference the symlinks before handing them over to `docker build`.

[^1]: These resources can naturally be considered part of the docs, so moving the actual files into the place of the current symlinks and replacing them with (unproblematic) symlinks can be said to improve semantics as well.

[^2]: The flatbuffers docs already use the original file, not the symlink, and in the most recent releases, starting from flatbuffers-25.1.21, the symlink is replaced by the actual file thanks to a documentation overhaul.

[^3]: These resources are flatbuffers tests for Java and Kotlin and can be omitted from our sdist.

[^4]: We don't need to ship the Rust bindings for ittapi.

[^5]: These are demonstration examples for how to link to prometheus-cpp using CMake and can be omitted.

### NCCL

NCCL used to be included as a submodule. However, with #146073 (first released in v2.7.0-rc1), the submodule was removed and replaced with a build-time checkout procedure in `tools/build_pytorch_libs.py`, which checks out the required version of NCCL from the upstream repository based on a commit pin recorded in `.ci/docker/ci_commit_pins/nccl-cu{11,12}.txt`.

This means that a crucial third-party dependency is missing from the source distribution, and because the `.ci` folder is omitted from the source distribution, it is not possible to use the build-time download. However, it *is* possible to use a system-provided NCCL via the `USE_SYSTEM_NCCL` environment variable, which is now also the default for the official PyTorch wheels.

Pull Request resolved: #152098
Approved by: https://github.com/atalman
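
The symlink rule from the commit message (no target that leaves the tree, no `..` component) can be checked mechanically before building an sdist. The following is a minimal sketch, not part of this commit: the helper names `classify_target` and `find_symlinks` are hypothetical, and treating absolute targets as invalid is an assumption based on the "must not point outside the destination directory" requirement.

```python
import os
from pathlib import Path, PurePosixPath


def classify_target(target: str) -> str:
    """Classify a symlink target string for sdist purposes (hypothetical helper)."""
    path = PurePosixPath(target)
    if path.is_absolute():
        # Assumption: an absolute target always points outside the sdist tree.
        return "absolute target"
    if ".." in path.parts:
        return "`..` in target"
    return "ok"


def find_symlinks(root):
    """Yield (path relative to root, link target, status) for every symlink under root."""
    root = Path(root)
    # followlinks=False (the default) lists symlinked directories without descending into them.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = Path(dirpath) / name
            if path.is_symlink():
                target = os.readlink(path)
                yield path.relative_to(root).as_posix(), target, classify_target(target)
```

Iterating over `find_symlinks(".")` from the repository root would produce rows comparable to the table in the commit message, with the `status` field corresponding to the "problem" column.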
1 parent c7b6c98 commit f16053f

17 files changed: +193 -106 lines changed

.ci/docker/build.sh

Lines changed: 2 additions & 2 deletions
@@ -383,7 +383,7 @@ if [[ -n "${CI:-}" ]]; then
 fi
 
 # Build image
-docker build \
+tar ch . | docker build \
 ${no_cache_flag} \
 ${progress_flag} \
 --build-arg "BUILD_ENVIRONMENT=${image}" \
@@ -422,7 +422,7 @@ docker build \
 -f $(dirname ${DOCKERFILE})/Dockerfile \
 -t "$tmp_tag" \
 "$@" \
-.
+-
 
 # NVIDIA dockers for RC releases use tag names like `11.0-cudnn9-devel-ubuntu18.04-rc`,
 # for this case we will set UBUNTU_VERSION to `18.04-rc` so that the Dockerfile could

.ci/docker/requirements-docs.txt

Lines changed: 0 additions & 61 deletions
This file was deleted.

.ci/docker/requirements-docs.txt

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+../../docs/requirements.txt

.github/workflows/create_release.yml

Lines changed: 42 additions & 15 deletions
@@ -35,6 +35,7 @@ jobs:
 contents: write
 outputs:
 pt_release_name: ${{ steps.release_name.outputs.pt_release_name }}
+pt_pep517_release_name: ${{ steps.release_name.outputs.pt_pep517_release_name }}
 steps:
 - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
 with:
@@ -53,36 +54,57 @@ jobs:
 tag_or_branch="${tag_or_branch#refs/heads/}"
 # replace directory separators with _ in branch name
 tag_or_branch="${tag_or_branch//\//_}"
-echo "PT_RELEASE_NAME=pytorch-$tag_or_branch" >> "$GITHUB_ENV"
-echo "PT_RELEASE_FILE=pytorch-$tag_or_branch.tar.gz" >> "$GITHUB_ENV"
+torch_version="$(python -c 'from tools.generate_torch_version import get_torch_version; print(get_torch_version())')"
+{
+echo "PT_RELEASE_NAME=pytorch-$tag_or_branch";
+echo "PT_RELEASE_FILE=pytorch-$tag_or_branch.tar.gz";
+echo "PT_PEP517_RELEASE_FILE=torch-${torch_version}.tar.gz";
+} >> "$GITHUB_ENV"
 - name: Checkout optional submodules
 run: python3 tools/optional_submodules.py
 - name: Create source distribution
 run: |
-# Create new folder with specified name so extracting the archive yields that
-rm -rf "/tmp/$PT_RELEASE_NAME"
-cp -r "$PWD" "/tmp/$PT_RELEASE_NAME"
-mv "/tmp/$PT_RELEASE_NAME" .
-# Cleanup
-rm -rf "$PT_RELEASE_NAME"/{.circleci,.ci}
-find "$PT_RELEASE_NAME" -name '.git*' -exec rm -rv {} \; || true
-# Create archive
-tar -czf "$PT_RELEASE_FILE" "$PT_RELEASE_NAME"
-echo "Created source archive $PT_RELEASE_FILE with content: $(ls -a "$PT_RELEASE_NAME")"
+# Create new folder with specified name so extracting the archive yields that
+rm -rf "/tmp/$PT_RELEASE_NAME"
+cp -r "$PWD" "/tmp/$PT_RELEASE_NAME"
+mv "/tmp/$PT_RELEASE_NAME" .
+# Cleanup
+rm -rf "$PT_RELEASE_NAME"/{.circleci,.ci}
+find "$PT_RELEASE_NAME" -name '.git*' -exec rm -rv {} \; || true
+# Create archive
+tar -czf "$PT_RELEASE_FILE" "$PT_RELEASE_NAME"
+echo "Created source archive $PT_RELEASE_FILE with content: $(ls -a "$PT_RELEASE_NAME")"
+- name: Create PEP 517 compatible source distribution
+run: |
+pip install build==1.2.2.post1 || exit 1
+python -m build --sdist || exit 1
+cd dist || exit 1
 - name: Upload source distribution for release
 if: ${{ github.event_name == 'release' }}
 uses: softprops/action-gh-release@da05d552573ad5aba039eaac05058a918a7bf631 # v2.2.2
 with:
-files: ${{env.PT_RELEASE_FILE}}
+files: |
+${{ env.PT_RELEASE_FILE }}
+${{ env.PT_PEP517_RELEASE_FILE }}
 - name: Upload source distribution to GHA artifacts for release tags
 if: ${{ github.event_name == 'push' && startsWith(github.ref, 'refs/tags/v') && contains(github.ref, 'rc') }}
 uses: actions/upload-artifact@50769540e7f4bd5e21e526ee35c689e35e0d6874 # v4.4.0
 with:
 name: ${{ env.PT_RELEASE_FILE }}
 path: ${{ env.PT_RELEASE_FILE }}
+- name: Upload PEP 517 source distribution to GHA artifacts for release tags
+if: ${{ github.event_name == 'push' && startsWith(github.ref, 'refs/tags/v') && contains(github.ref, 'rc') }}
+uses: actions/upload-artifact@50769540e7f4bd5e21e526ee35c689e35e0d6874 # v4.4.0
+with:
+name: ${{ env.PT_PEP517_RELEASE_FILE }}
+path: dist/${{ env.PT_PEP517_RELEASE_FILE }}
 - name: Set output
 id: release_name
-run: echo "name=pt_release_name::${{ env.PT_RELEASE_NAME }}.tar.gz" >> "${GITHUB_OUTPUT}"
+run: |
+{
+echo "name=pt_release_name::${{ env.PT_RELEASE_FILE }}";
+echo "name=pt_pep517_release_name::${{ env.PT_PEP517_RELEASE_FILE }}";
+} >> "${GITHUB_OUTPUT}"
 
 upload_source_code_to_s3:
 if: ${{ github.repository == 'pytorch/pytorch' && github.event_name == 'push' && startsWith(github.ref, 'refs/tags/v') && contains(github.ref, 'rc') }}
@@ -98,6 +120,9 @@ jobs:
 - uses: actions/download-artifact@65a9edc5881444af0b9093a5e628f2fe47ea3b2e # v4.1.7
 with:
 name: ${{ needs.release.outputs.pt_release_name }}
+- uses: actions/download-artifact@65a9edc5881444af0b9093a5e628f2fe47ea3b2e # v4.1.7
+with:
+name: ${{ needs.release.outputs.pt_pep517_release_name }}
 - name: Configure AWS credentials(PyTorch account)
 uses: aws-actions/configure-aws-credentials@ececac1a45f3b08a01d2dd070d28d111c5fe6722 # v4.1.0
 with:
@@ -108,7 +133,9 @@ jobs:
 s3-bucket: pytorch
 s3-prefix: source_code/test
 if-no-files-found: warn
-path: ${{ needs.release.outputs.pt_release_name }}
+path: |
+${{ needs.release.outputs.pt_release_name }}
+${{ needs.release.outputs.pt_pep517_release_name }}
 
 concurrency:
 group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.sha }}-${{ github.event_name }}

MANIFEST.in

Lines changed: 86 additions & 26 deletions
@@ -1,31 +1,91 @@
+# Include individual top-level files
 include MANIFEST.in
-include CMakeLists.txt
+include BUCK.oss
+include BUILD.bazel
 include CITATION.cff
+include CODEOWNERS
+include Dockerfile
 include LICENSE
+include Makefile
 include NOTICE
-include .gitmodules
-include build_variables.bzl
-include mypy.ini
-include requirements.txt
-include ufunc_defs.bzl
-include version.txt
-recursive-include android *.*
-recursive-include aten *.*
-recursive-include binaries *.*
-recursive-include c10 *.*
-recursive-include caffe2 *.*
-recursive-include cmake *.*
-recursive-include torch *.*
-recursive-include tools *.*
-recursive-include test *.*
-recursive-include docs *.*
-recursive-include ios *.*
-recursive-include third_party *
-recursive-include test *.*
-recursive-include benchmarks *.*
-recursive-include scripts *.*
-recursive-include mypy_plugins *.*
-recursive-include modules *.*
-recursive-include functorch *.*
+include WORKSPACE
+include .bazelignore .bazelrc .bazelversion
+include .clang-format .clang-tidy
+include .cmakelintrc
+include .coveragerc
+include .dockerignore
+include .flake8
+include .gdbinit
+include .lintrunner.toml
+include .lldbinit
+include docker.Makefile
+include ubsan.supp
+
+# Include bazel related files
+include *.bzl
+# Include general configuration files
+include *.ini
+# Include important top-level information
+include *.md
+# Include technical text files
+include *.txt
+
+# Include ctags configuration
+include .ctags.d/*.ctags
+
+# Include subfolders completely
+graft .devcontainer
+graft .vscode
+graft android
+# The following folder (assets) is empty except for a .gitignore file, which
+# will not be included in the sdist, hence we include the directory explicitly.
+include android/test_app/app/src/main/assets
+graft aten
+graft binaries
+graft c10
+graft caffe2
+graft cmake
+graft torch
+graft tools
+graft test
+graft docs
+graft ios
+graft third_party
+graft test
+graft benchmarks
+graft scripts
+graft mypy_plugins
+graft modules
+graft functorch
+graft torchgen
+
+# The following exclusions omit parts from third-party dependencies that
+# contain invalid symlinks[1] and that are not needed for pytorch, such as
+# bindings for unused languages
+prune third_party/ittapi/rust
+prune third_party/flatbuffers/java
+prune third_party/flatbuffers/kotlin
+prune third_party/nccl/pkg/debian
+prune third_party/opentelemetry-cpp/third_party/prometheus-cpp/cmake/project-import-*
+
+# The following document is also an invalid symlink[1] and superfluous
+exclude third_party/flatbuffers/docs/source/CONTRIBUTING.md
+
+# Omit autogenerated code
+prune torchgen/packaged
+
+# Omit caches, compiled, and scm related content
 prune */__pycache__
-global-exclude *.o *.so *.dylib *.a .git *.pyc *.swp
+prune **/.github
+prune **/.gitlab
+global-exclude *.o *.so *.dylib *.a
+global-exclude *.pyc *.swp
+global-exclude .git .git-blame-ignore-revs .gitattributes .gitignore .gitmodules
+global-exclude .gitlab-ci.yml
+
+# [1] Invalid symlinks for the purposes of Python source distributions are,
+# according to the source distribution format[2] links pointing outside the
+# destination directory or links with a `..` component, which is those of
+# concern here.
+
+# [2] https://packaging.python.org/en/latest/specifications/source-distribution-format/#source-distribution-archive-features
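
Whether the `prune` rules in this `MANIFEST.in` took effect can be verified by scanning the generated archive. The sketch below is illustrative only and not part of this commit: `leaked_members` and the `PRUNED` tuple are hypothetical names, and the list covers just a subset of the pruned directories. It relies on the sdist convention that all members sit under a single top-level `{name}-{version}/` directory.

```python
import tarfile
from pathlib import PurePosixPath

# A subset of the directories pruned by MANIFEST.in (illustrative, not exhaustive).
PRUNED = (
    "third_party/ittapi/rust",
    "third_party/flatbuffers/java",
    "third_party/flatbuffers/kotlin",
    "torchgen/packaged",
)


def leaked_members(archive, pruned=PRUNED):
    """Return names of archive members that live under a pruned directory."""
    prefixes = [PurePosixPath(p).parts for p in pruned]
    leaked = []
    for name in archive.getnames():
        # Drop the leading "{name}-{version}/" component of the sdist layout.
        inner = PurePosixPath(name).parts[1:]
        if any(inner[: len(prefix)] == prefix for prefix in prefixes):
            leaked.append(name)
    return leaked
```

Opening the file produced by `python -m build --sdist` with `tarfile.open(...)` and passing it to `leaked_members` should return an empty list if the rules work as intended.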

docs/requirements.txt

Lines changed: 0 additions & 1 deletion
This file was deleted.

docs/requirements.txt

Lines changed: 61 additions & 0 deletions
@@ -0,0 +1,61 @@
+sphinx==5.3.0
+#Description: This is used to generate PyTorch docs
+#Pinned versions: 5.3.0
+-e git+https://github.com/pytorch/pytorch_sphinx_theme.git@pytorch_sphinx_theme2#egg=pytorch_sphinx_theme2
+
+# TODO: sphinxcontrib.katex 0.9.0 adds a local KaTeX server to speed up pre-rendering
+# but it doesn't seem to work and hangs around idly. The initial thought is probably
+# something related to Docker setup. We can investigate this later
+
+sphinxcontrib.katex==0.8.6
+#Description: This is used to generate PyTorch docs
+#Pinned versions: 0.8.6
+
+sphinxext-opengraph==0.9.1
+#Description: This is used to generate PyTorch docs
+#Pinned versions: 0.9.1
+
+sphinx_sitemap==2.6.0
+#Description: This is used to generate sitemap for PyTorch docs
+#Pinned versions: 2.6.0
+
+matplotlib==3.5.3 ; python_version < "3.13"
+matplotlib==3.6.3 ; python_version >= "3.13"
+#Description: This is used to generate PyTorch docs
+#Pinned versions: 3.6.3 if python > 3.12. Otherwise 3.5.3.
+
+tensorboard==2.13.0 ; python_version < "3.13"
+tensorboard==2.18.0 ; python_version >= "3.13"
+#Description: This is used to generate PyTorch docs
+#Pinned versions: 2.13.0
+
+breathe==4.34.0
+#Description: This is used to generate PyTorch C++ docs
+#Pinned versions: 4.34.0
+
+exhale==0.2.3
+#Description: This is used to generate PyTorch C++ docs
+#Pinned versions: 0.2.3
+
+docutils==0.16
+#Description: This is used to generate PyTorch C++ docs
+#Pinned versions: 0.16
+
+bs4==0.0.1
+#Description: This is used to generate PyTorch C++ docs
+#Pinned versions: 0.0.1
+
+IPython==8.12.0
+#Description: This is used to generate PyTorch functorch docs
+#Pinned versions: 8.12.0
+
+myst-nb==0.17.2
+#Description: This is used to generate PyTorch functorch docs
+#Pinned versions: 0.13.2
+
+# The following are required to build torch.distributed.elastic.rendezvous.etcd* docs
+python-etcd==0.4.5
+sphinx-copybutton==0.5.0
+sphinx-design==0.4.0
+sphinxcontrib-mermaid==1.0.0
+myst-parser==0.18.1

functorch/docs/source/notebooks

Lines changed: 0 additions & 1 deletion
This file was deleted.
