Upstream downstream patches for allowing to make a relocatable (standalone) Python installation #119696

Open
FFY00 opened this issue May 28, 2024 · 20 comments
Labels
topic-installation type-feature A feature request or enhancement

Comments

@FFY00
Member

FFY00 commented May 28, 2024

Feature or enhancement

Proposal:

I am currently looking at https://github.com/indygreg/python-build-standalone/tree/main/cpython-unix

This would be the first step towards python.org possibly providing a binary build for Linux.

cc @jezdez @indygreg

Has this already been discussed elsewhere?

No response given

Links to previous discussion of this feature:

No response

@FFY00 FFY00 added the type-feature A feature request or enhancement label May 28, 2024
@vstinner
Member

vstinner commented Jun 3, 2024

You should elaborate the issue a little bit :-) Which changes do you want to upstream? To do what?

@FFY00
Member Author

FFY00 commented Jun 11, 2024

Yes, sorry! And sorry for not seeing this earlier, GitHub notifications are a mess. The goal would be to ultimately have a fully relocatable Python installation (there's some discussion on GH-62509). This issue is specifically to track the work on understanding which downstream patches make sense to upstream, and to upstream them.

https://github.com/indygreg/python-build-standalone/tree/main/cpython-unix is a good starting point.

My thinking here was to not re-do all the work, and base on downstream projects already doing this. Does that make sense?

@vstinner
Member

There are around 26 patches, it's scary. I suppose that we can decide on a case by case basis. But having an overall rationale for what is a "relocatable Python installation" would help.

@FFY00
Member Author

FFY00 commented Jun 11, 2024

Well, I'd like to be able to ship official binaries for Linux on python.org at some point, that's my main motivation. Relocatable installs also make a lot of sense for applications like embedding, etc.

@vstinner
Member

Can you please describe what is a "relocatable Python installation"? What are the current issues that you're trying to solve? Why is it important for Python to support such installation?

@warsaw
Member

warsaw commented Jul 23, 2024

> Can you please describe what is a "relocatable Python installation"? What are the current issues that you're trying to solve? Why is it important for Python to support such installation?

Maybe this should be a PEP?

@zanieb
Contributor

zanieb commented Sep 6, 2024

Hi!

As commented in the PEP 711 PyBI discussion, standalone Python distributions are being used today in several tools and by many users. For example, Hatch, PyApp, Rye, and uv all use indygreg's standalone builds. I think this demonstrates some need.

@mitsuhiko has some documentation about the use of standalone builds in Rye:

> The motivation for this is that it makes it easy to switch between Python versions, to have a common experience across different Rye users and to avoid odd bugs caused by changes in behavior.
>
> Many other Python versions you can install on your computer are non-portable, which means that if you move them to a new location on your machine, or copy them onto another computer (even with the same operating system), they will no longer run. This is undesirable for what Rye wants to do. For one, we want the same experience for all Python developers, no matter which operating system they use. Secondly, we want to enable self-contained Python builds later, which requires that the Python installation is portable.

@ofek uses standalone builds in PyApp to create bootstrapped standalone Python applications.

For use in uv (which I maintain), it is particularly important for us to be able to download optimized builds and place them in an arbitrary location on the user's machine. We need support for many Python versions. We don't want the user to need to install dependencies other than uv itself. We don't want behavior of Python to change depending on the versions of libraries on the user's machine. Compiling Python on the user's machine isn't compatible with those goals. We can't invoke distribution-specific package managers because they require root, and we don't want to require root to bootstrap Python.

I'm not an expert and I don't have context on why CPython builds use absolute paths today. I'm sure it's a complicated topic! I have some team members that can probably provide some more detail, but I figured I'd take a swing at an initial answer.

While writing this, I found some previous discussion here if you want more context. I also thought there was some relevant discussion in PEP 711.

@zooba
Member

zooba commented Sep 6, 2024

> I'm not an expert and I don't have context on why CPython builds use absolute paths today.

It's just platform convention. All the paths on Windows are relative by default, because that's how that platform works, but other platforms use absolute references by default.

Python's build system was also designed to integrate into an OS (I undid a lot of that for Windows back around 3.5), not to be a standalone app. Since that was basically the only distribution mechanism back in the 90's, it makes perfect sense, but it should certainly be open to updating.
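
To see the absolute references described above on a particular build, one can scan the build-time configuration for values that embed a given prefix. A minimal sketch, not from the thread: `vars_embedding_prefix` is a hypothetical helper, and which entries it reports depends entirely on how the interpreter was built and installed.

```python
import sys
import sysconfig


def vars_embedding_prefix(config_vars, prefix):
    """Return the sorted names of config variables whose string value contains prefix."""
    return sorted(
        name
        for name, value in config_vars.items()
        if isinstance(value, str) and prefix and prefix in value
    )


if __name__ == "__main__":
    # Report every build-time variable that mentions the running prefix.
    for name in vars_embedding_prefix(sysconfig.get_config_vars(), sys.base_prefix):
        print(name, "=", sysconfig.get_config_var(name))
```

The function takes a plain mapping so it can be pointed at a saved configuration from another machine as well as at the running interpreter.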

@zanieb
Contributor

zanieb commented Sep 6, 2024

That's good to know, thanks for sharing that additional context. It makes sense that it's mostly historical and aligned with the distribution mechanisms of the time. I think it's fair to say that lately there is an increasing focus on portable distribution mechanisms for applications and it'd be great to see Python move towards supporting those use-cases. I'm curious if there are reasons this wouldn't be desirable? (Other than the obvious large amount of work necessary to get there)

@zooba
Member

zooba commented Sep 9, 2024

The main reason that I find it undesirable is that portable (particularly "many-Linux" portable) distributions will put the build, maintenance, and release burden back on the volunteer team. Right now, it's handled by paid employees or volunteers at the companies that produce their own distributions.

If we were to maintain our current policy (no binaries for security-fix only releases) but get users onto our own portable releases, many users would find themselves stuck when they need to switch to a different release mechanism to get trustworthy security fixes. It's possible that new companies would start distributing portable builds with fixes integrated (it's worth noting that Anaconda already does this), but I expect most Linux distros would stick with the builds that integrate into the rest of their OS.

Now, don't get me wrong, I think most Python users should avoid their distro's own build of Python (it's usually the system Python, meant for system apps, and the user's app shouldn't blindly adopt the system's one). But I don't think that most are ready to do it, and at best will end up relying on repackaged builds from upstream. So there's a pretty significant community dynamic here that is not obviously going to work out in users' favour.

All that said, having the capability to build a relocatable install would be great. I have my own set of build hacks to make it possible for my $WORK scenarios, so I'd love to drop those. Making relocatable releases is actually a totally separate question, but as it's the motivation for having the changes upstream, until somebody says they're going to make those releases, it looks like pretty pointless churn unfortunately.

@vstinner
Member

The cpython-unix/ directory of https://github.com/indygreg/python-build-standalone contains 37 patches. Which patches are the most important? I don't see which patch magically makes Python relocatable. Is it another script which makes it relocatable?

@zanieb
Contributor

zanieb commented Sep 10, 2024

That's a good question — I can try to answer it but I'll need to dig around quite a bit since I didn't write the project. Let me loop in @indygreg, if he has time.

@zooba
Member

zooba commented Sep 10, 2024

This is the most critical (native) part:

https://github.com/indygreg/python-build-standalone/blob/ef71d13ba98706995ef056801fea1e88e172d7c6/cpython-unix/build-cpython.sh#L516-L566

Changes in the build script would presumably pass different linker options rather than patching the built binaries.

I believe most of the other changes are to clean up assumptions within the stdlib (primarily sysconfig I'd imagine) about the layout.
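
A sketch for illustration, not taken from the linked script: on ELF platforms the dynamic loader expands `$ORIGIN` in a run path to the directory containing the binary, so a run path written relative to `$ORIGIN` keeps working when the install tree moves. The helper name `origin_relative_rpath` and the example paths are assumptions.

```python
import posixpath


def origin_relative_rpath(binary_path, lib_dir):
    """Return an ELF run path that locates lib_dir relative to the binary's directory."""
    rel = posixpath.relpath(lib_dir, posixpath.dirname(binary_path))
    # "$ORIGIN" is expanded by the ELF dynamic loader to the directory of the binary.
    return "$ORIGIN" if rel == "." else "$ORIGIN/" + rel


print(origin_relative_rpath("/opt/python/bin/python3", "/opt/python/lib"))  # prints $ORIGIN/../lib
```

A value like this could be handed to the linker at build time (for example through the GNU linker's `-rpath` option) instead of being patched into the binaries afterwards.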

@tanzim

tanzim commented Nov 20, 2024

Did I get the TL;DR right on this?

  • CPython maintainers are happy to accept patches and work with contributors to make CPython relocatable on Linux (and hopefully macOS too)
  • The stated goal of distributing standalone relocatable distributions via python.org is out of scope, as it would put an extra burden on the maintainers

If that's the case, it makes a lot of sense to me. I too create embeddable and relocatable Python distributions for macOS and Windows for $WORK and would love it if I didn't have to maintain my own patches/hacks etc. Having the support in CPython would be invaluable.

@vstinner
Member

> CPython maintainers are happy to accept patches and work with contributors to make CPython relocatable on Linux (and hopefully macOS too)

So far, no pull request was proposed.

@FFY00
Member Author

FFY00 commented Dec 7, 2024

> Maybe this should be a PEP?

Perhaps, I think there are two parts of this work:

  1. Make sure the Python runtime works on relocatable builds
  2. Add support for relocatable builds in the build system

Since I don't think we currently rely on the build prefix for path calculation, 1) should already be supported, and shouldn't require any major changes. Runtime issues that relocatable builds hit are likely bugs anyway. Considering this, I think 1) shouldn't need a PEP, as long as we don't run into any major issues or design constraints while working on it.

Regarding 2), I think that might need a PEP. Though, if there are any build system improvements we can make to help minimize the patching required for projects like python-build-standalone, I think it should be okay to do so, but that can be evaluated on a case-by-case basis.
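
Returning to 1): a rough way to check a given install is to ask whether the runtime prefix is an ancestor of the running executable, which is what one would expect when paths are derived from the executable's location rather than from the build prefix. A minimal sketch, not from the thread: `prefix_contains_executable` is a hypothetical helper, and the check is only a heuristic that some legitimate layouts may fail.

```python
import sys
from pathlib import Path


def prefix_contains_executable(executable, prefix):
    """Return True if prefix is an ancestor directory of executable (symlinks resolved)."""
    exe = Path(executable).resolve()
    return Path(prefix).resolve() in exe.parents


if __name__ == "__main__":
    # Compare the running interpreter against its base prefix.
    print(prefix_contains_executable(sys.executable, sys.base_prefix))
```

Running this before and after moving an install tree gives a quick signal of whether the prefix followed the executable to its new location.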


> CPython maintainers are happy to accept patches and work with contributors to make CPython relocatable on Linux (and hopefully macOS too)

Yes. You can ping me to review such contributions.

> The stated goal of distributing standalone relocatable distributions via python.org is out of scope, as it would put an extra burden on the maintainers

I don't think that's out of the picture, but it's out of scope for this issue, and it would require a PEP. It's something to maybe consider if the work covered in this issue is successful, but for now we should focus on bridging the gap between upstream and the downstreams, given how critical this use case is to the Python ecosystem.

@zooba
Member

zooba commented Dec 9, 2024

> Regarding 2), I think that might need a PEP. Though, if there are any build system improvements we can make to help minimize the patching required for projects like python-build-standalone, I think it should be okay to do so, but that can be evaluated on a case-by-case basis.

Nah, provided it doesn't change the default or break any existing options, we can add support to the build system like any other feature. I would like to be able to argue for certain changes to be treated as bugfixes, since they're important to users who build from source and so are very impactful beyond our maintenance period, but we certainly shouldn't need a PEP to introduce improvements.

@emmatyping
Member

emmatyping commented Feb 7, 2025

I'm interested in helping to make a relocatable Linux build of CPython a supported build option. I went through the build script and patches and here are the patches I think should be applied or adopted in some way to enable that. Note this list does not include a number of patches which should probably be upstreamed related to macOS and Linux cross compiling. It also only includes patches targeting the latest CPython.

  • patch-make-testembed-nolink-tcltk.patch - This patch handles the situation where testembed and libpython link against Tcl/Tk when built statically. In a static build, this causes duplicate symbols, so testembed should not link against Tcl/Tk when libpython links to modules statically.

  • patch-python-link-modules-3.11.patch - This patch is targeted towards macOS, but I think it motivates changes more generally:

    # Also on macOS, the `python` executable is linked against libraries defined by statically
    # linked modules. But those libraries should only get linked into libpython, not the
    # executable. This behavior is kinda suspect on all platforms, as it could be adding
    # library dependencies that shouldn't need to be there.
    

    I think I would agree it would be better to link libraries against libpython rather than the python executable. This is a pretty significant change however, so I don't know if it is worth the risk, especially if we're just focusing on Linux for now.

  • patch-tkinter-3.13.patch - This patch updates how unix builds find Tcl/Tk libraries to search in a few more places, which seems useful for a portable build.

  • patch-ctypes-static-binary.patch - This patch resolves #81241 ("import ctypes fails with a statically linked interpreter due to dlopen() failure") by simply not making the ctypes.pythonapi object available. I'm not totally sure what the best path forward here is, but perhaps just noting in the documentation that it is unavailable on static CPython builds is sufficient?

  • patch-configure-disable-stdlib-mod-3.12 and patch-pwd-remove-conditional.patch - These patches hack around not using the configure script's module enable/disable logic. The patch comments say:

    # Python 3.11 has configure support for configuring extension modules. We really,
    # really, really want to use this feature because it looks promising. But at the
    # time we added this code the functionality didn't support all extension modules
    # nor did it easily support static linking, including static linking of extra
    # libraries (which appears to be a limitation of `makesetup`). So for now we
    # disable the functionality and require our auto-generated Setup.local to provide
    # everything.
    

    So to me this seems like the work left to do is make sure that the module enable/disable logic and makesetup script support static linking. This also affects patch-checksharedmods-disable.patch, which disables the shared module check because of the above patches.

  • patch-test-embed-prevent-segfault.patch - This may be an issue with the BOLT optimization when CPython is compiled statically? This needs more investigation to figure out why there is a crash.

I'll take a look at the more obvious ones (e.g. addressing #81241) as a start.
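
On the ctypes item above: if `ctypes.pythonapi` can be absent on some builds, as that patch arranges for static ones, calling code can degrade gracefully rather than assume the attribute exists. A minimal sketch; `get_pythonapi` is a hypothetical helper, not an existing API.

```python
import ctypes


def get_pythonapi():
    """Return ctypes.pythonapi, or None on builds where it is not provided."""
    return getattr(ctypes, "pythonapi", None)


api = get_pythonapi()
if api is None:
    print("ctypes.pythonapi is unavailable on this build")
else:
    # Py_GetVersion is part of the C API and returns a C string.
    api.Py_GetVersion.restype = ctypes.c_char_p
    print(api.Py_GetVersion().decode())
```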

@geofft
Contributor

geofft commented Feb 11, 2025

I think there's a couple of different definitions of "relocatable". For instance, patch-ctypes-static-binary seems to be specifically needed where the Python interpreter is fully statically linked (does not use the ELF interpreter, ld.so), but our current releases are dynamically linked (uses the systemwide ld.so) and so I think this patch is not needed. There are probably a couple of other patches in python-build-standalone/cpython-unix/ that aren't relevant for what python-build-standalone is doing these days.

Another distinction is whether third-party libraries that are used by Python (zlib, Expat, Tcl/Tk, etc.) are statically linked into libpython or the Python binary (which is orthogonal to whether the Python binary itself uses ld.so or not), and if not, whether they're picked up from the system or from something shipped along with the relocatable Python installation. I think python-build-standalone could ship all of these third-party libraries as separate shared libraries, but it's just that because of its history as targeting a static single-file interpreter, it doesn't.
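
One way to see which of these cases applies to a given binary on Linux is to look at where its shared-library dependencies resolve. A minimal sketch that parses text in the shape printed by `ldd`; the helper name `deps_outside_prefix` and the sample output are made up for illustration.

```python
def deps_outside_prefix(ldd_output, prefix):
    """Return {soname: path} for dependencies that resolve outside prefix.

    Expects the usual `ldd` line shape "NAME => PATH (ADDRESS)"; lines without
    a resolved absolute path (for example the vDSO) are skipped.
    """
    root = prefix.rstrip("/") + "/"
    outside = {}
    for line in ldd_output.splitlines():
        name, arrow, rest = line.strip().partition(" => ")
        if not arrow:
            continue
        path = rest.split(" (")[0].strip()
        if path.startswith("/") and not path.startswith(root):
            outside[name] = path
    return outside


sample = """\
\tlinux-vdso.so.1 (0x00007ffd3a5f2000)
\tlibpython3.13.so.1.0 => /opt/python/lib/libpython3.13.so.1.0 (0x00007f1c2a000000)
\tlibz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f1c29e00000)
\tlibc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f1c29c00000)
"""
print(deps_outside_prefix(sample, "/opt/python"))
```

In the sample, libz and libc resolve to system paths while libpython comes from inside the install, which corresponds to the "picked up from the system" case.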

Out of curiosity, @emmatyping, do you have some specific use case in mind for a relocatable Python?

@emmatyping
Member

> I think there's a couple of different definitions of "relocatable".

Very true! In my mind it is merely "if I take this install, and scp it to another machine, it should continue to work". Which implies a lot about runtime dependency and module loading (e.g. all dependent libraries are consistently loaded).

> For instance, patch-ctypes-static-binary seems to be specifically needed where the Python interpreter is fully statically linked (does not use the ELF interpreter, ld.so), but our current releases are dynamically linked (uses the systemwide ld.so) and so I think this patch is not needed.

I'll check if this is the case, it would make sense it isn't needed.

> I think python-build-standalone could ship all of these third-party libraries as separate shared libraries, but it's just that because of its history as targeting a static single-file interpreter, it doesn't.

I'm a little concerned ld.so.cache may cause problems here. In a scenario where libpython.so is embedded in a program, if the program loads any library of the same SONAME, the bundled dependent library wouldn't be used, which could cause errors about missing symbols if the system library is older than the one used for the relocatable Python.
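
The failure mode can be pictured with a toy model (deliberately simplified, and not how any real loader is implemented): a request for a library is satisfied by whichever object with that SONAME was loaded first, regardless of which copy the later requester shipped. `ToyLoader` and the paths below are hypothetical.

```python
class ToyLoader:
    """Toy model of SONAME de-duplication; it never touches the filesystem."""

    def __init__(self):
        self.loaded = {}  # SONAME -> path of the first copy that was loaded

    def load(self, soname, candidate_path):
        # If an object with this SONAME is already loaded, reuse it and
        # ignore the candidate path supplied by the later requester.
        return self.loaded.setdefault(soname, candidate_path)


loader = ToyLoader()
# The host program loads the (older) system copy first.
loader.load("libz.so.1", "/usr/lib/libz.so.1")
# The embedded libpython then asks for its bundled copy.
print(loader.load("libz.so.1", "/opt/python/lib/libz.so.1"))  # prints /usr/lib/libz.so.1
```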

> Out of curiosity, @emmatyping, do you have some specific use case in mind for a relocatable Python?

I would love to see an official relocatable build of CPython for Linux for many reasons (not needing to build CPython myself to get a version my OS doesn't package, a more official/signed equivalent to python-build-standalone, etc.). I realize working on this won't mean that happens, and that'd probably require a PEP, but I think it is generally useful if a relocatable Linux (and probably macOS as well) build is a supported build configuration.

Projects
None yet
Development

No branches or pull requests

9 participants