gh-125498: Update JIT builds to use LLVM 19 and use preserve_none #125499

Merged 48 commits on Oct 30, 2024
Commits
94252cf
Update to LLVM 19
savannahostrowski Sep 14, 2024
f8dc236
update syntax in disabled gil ci
savannahostrowski Sep 14, 2024
80a3b40
📜🤖 Added by blurb_it.
blurb-it[bot] Sep 14, 2024
ba02d7c
Merge branch 'main' into jit-llvm-19
savannahostrowski Sep 25, 2024
7a1133e
Update readme
savannahostrowski Sep 25, 2024
d8e38db
fix free-threaded by pseudo-pinning version
savannahostrowski Sep 25, 2024
b8ae218
Add check to see that registers match in stencil generation
savannahostrowski Sep 26, 2024
4368d5f
Appease linters
savannahostrowski Sep 26, 2024
12fc5cd
Remove devcontainer instructions from readme
savannahostrowski Sep 26, 2024
7f9fe5a
Update README
savannahostrowski Sep 26, 2024
8c21729
Merge branch 'main' into jit-llvm-19
savannahostrowski Oct 1, 2024
a597ea5
Remove ghccc
savannahostrowski Oct 13, 2024
4df5efc
Merge branch 'main' into remove-ghccc
savannahostrowski Oct 13, 2024
85b858d
add back cpu_count
savannahostrowski Oct 13, 2024
9209651
Appease linter
savannahostrowski Oct 13, 2024
a842f90
Add sys import
savannahostrowski Oct 13, 2024
73c725b
Move preserve_none
savannahostrowski Oct 13, 2024
4d8a012
define jit_func_preserve_none
savannahostrowski Oct 15, 2024
994af97
add comment
savannahostrowski Oct 15, 2024
0d86727
add comment
savannahostrowski Oct 15, 2024
98f0535
Fix whitespace
savannahostrowski Oct 15, 2024
9a20a2e
Move header to separate file
savannahostrowski Oct 15, 2024
4a2f3c4
Add newline
savannahostrowski Oct 15, 2024
4c4ca2f
Add newline
savannahostrowski Oct 15, 2024
7d1745a
Address PR comments
savannahostrowski Oct 16, 2024
c8d4692
Replace entry_symbol with string
savannahostrowski Oct 16, 2024
72d5ed0
Appease linter
savannahostrowski Oct 16, 2024
a96af70
Add newline
savannahostrowski Oct 16, 2024
9827ade
Merge branch 'main' into remove-ghccc
savannahostrowski Oct 16, 2024
4e32743
Merge branch 'main' into remove-ghccc
savannahostrowski Oct 18, 2024
ed29ae2
Merge branch 'main' into remove-ghccc
savannahostrowski Oct 18, 2024
3a2ecee
Merge branch 'main' into remove-ghccc
savannahostrowski Oct 21, 2024
3259994
Merge branch 'main' into remove-ghccc
savannahostrowski Oct 22, 2024
5ef69e6
📜🤖 Added by blurb_it.
blurb-it[bot] Oct 22, 2024
709bb08
Rephrase
savannahostrowski Oct 22, 2024
5ca8d61
Run pre-commit
savannahostrowski Oct 22, 2024
24d9143
Add newline
savannahostrowski Oct 22, 2024
b351303
Add line to remove symlink
savannahostrowski Oct 23, 2024
53ec962
Merge branch 'main' into remove-ghccc
savannahostrowski Oct 23, 2024
fe58a12
Fix typo
savannahostrowski Oct 23, 2024
a4b2d3e
Merge branch 'remove-ghccc' of https://github.com/savannahostrowski/c…
savannahostrowski Oct 23, 2024
0f88955
Fix wording
savannahostrowski Oct 23, 2024
4e16dd6
Merge branch 'main' into remove-ghccc
savannahostrowski Oct 25, 2024
bb1e650
Update Misc/NEWS.d/next/Core_and_Builtins/2024-10-22-04-18-53.gh-issu…
savannahostrowski Oct 29, 2024
5f3ec52
Apply suggestions from code review
savannahostrowski Oct 29, 2024
46cee93
Address PR comments
savannahostrowski Oct 29, 2024
c119a5d
Fix whitespace
savannahostrowski Oct 29, 2024
cbb0ddf
Add newline to jit.h
savannahostrowski Oct 29, 2024
15 changes: 12 additions & 3 deletions .github/workflows/jit.yml
@@ -61,7 +61,7 @@ jobs:
- true
- false
llvm:
- 18
- 19
include:
- target: i686-pc-windows-msvc/msvc
architecture: Win32
@@ -121,10 +121,15 @@ jobs:
choco install llvm --allow-downgrade --no-progress --version ${{ matrix.llvm }}.1.0
./PCbuild/build.bat --experimental-jit ${{ matrix.debug && '-d' || '' }} -p ${{ matrix.architecture }}

# The `find` line is required as a result of https://github.com/actions/runner-images/issues/9966.
# This is a bug in the macOS runner image where the pre-installed Python is installed in the same
# directory as the Homebrew Python, which causes the build to fail for macos-13. This line removes
# the symlink to the pre-installed Python so that the Homebrew Python is used instead.
- name: Native macOS
if: runner.os == 'macOS'
run: |
brew update
find /usr/local/bin -lname '*/Library/Frameworks/Python.framework/*' -delete
brew install llvm@${{ matrix.llvm }}
SDKROOT="$(xcrun --show-sdk-path)" \
./configure --enable-experimental-jit ${{ matrix.debug && '--with-pydebug' || '--enable-optimizations --with-lto' }}
@@ -165,15 +170,19 @@ jobs:
name: Free-Threaded (Debug)
needs: interpreter
runs-on: ubuntu-latest
strategy:
matrix:
llvm:
- 19
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
with:
python-version: '3.11'
- name: Build with JIT enabled and GIL disabled
run: |
sudo bash -c "$(wget -O - https://apt.llvm.org/llvm.sh)" ./llvm.sh 18
export PATH="$(llvm-config-18 --bindir):$PATH"
sudo bash -c "$(wget -O - https://apt.llvm.org/llvm.sh)" ./llvm.sh ${{ matrix.llvm }}
export PATH="$(llvm-config-${{ matrix.llvm }} --bindir):$PATH"
./configure --enable-experimental-jit --with-pydebug --disable-gil
make all --jobs 4
- name: Run tests
@@ -0,0 +1 @@
Update JIT compilation to use LLVM 19
@@ -0,0 +1,4 @@
The JIT has been updated to leverage Clang 19’s new ``preserve_none`` attribute,
which supports more platforms and is more useful than LLVM's existing ``ghccc``
calling convention. This also removes the need to manually patch the calling
convention in LLVM IR, simplifying the JIT compilation process.
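
For reviewers who have not seen the old workaround, the "manual patch" this entry refers to was a set of text replacements on the generated LLVM IR. The sketch below reuses the three replacements from the code removed from `Tools/jit/_targets.py` in this PR; the function name `patch_ghccc` and the sample IR lines are hypothetical and only meant to illustrate the effect.

```python
import re


def patch_ghccc(ir: str) -> str:
    """Rewrite LLVM IR text so that _JIT_* functions and musttail calls use ghccc.

    Hypothetical helper: the three replacements mirror the ones this PR removes.
    """
    # Declarations, definitions, and calls to named symbols starting with "_JIT_":
    ir = re.sub(r"(((noalias|nonnull|noundef) )*ptr @_JIT_\w+\()", r"ghccc \1", ir)
    # Calls to anonymous callees: anything with "musttail" must use the same
    # calling convention as its caller:
    ir = ir.replace("musttail call", "musttail call ghccc")
    # Both replacements can hit the same call site, so collapse the duplicate:
    return ir.replace("ghccc ghccc", "ghccc")


# Illustrative IR fragments (not taken from a real build):
named_call = "  %r = musttail call ptr @_JIT_CONTINUE(ptr %frame)\n"
anonymous_call = "  %r = musttail call ptr %5(ptr %frame)\n"
print(patch_ghccc(named_call), end="")
print(patch_ghccc(anonymous_call), end="")
```

With `preserve_none` declared directly in the C sources, this IR round trip is no longer needed and `clang` can compile each stencil in a single step.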
21 changes: 9 additions & 12 deletions Tools/jit/README.md
@@ -7,49 +7,46 @@ This version of CPython can be built with an experimental just-in-time compiler[

The JIT compiler does not require end users to install any third-party dependencies, but part of it must be *built* using LLVM[^why-llvm]. You are *not* required to build the rest of CPython using LLVM, or even the same version of LLVM (in fact, this is uncommon).

LLVM version 18 is required. Both `clang` and `llvm-readobj` need to be installed and discoverable (version suffixes, like `clang-18`, are okay). It's highly recommended that you also have `llvm-objdump` available, since this allows the build script to dump human-readable assembly for the generated code.
LLVM version 19 is required. Both `clang` and `llvm-readobj` need to be installed and discoverable (version suffixes, like `clang-19`, are okay). It's highly recommended that you also have `llvm-objdump` available, since this allows the build script to dump human-readable assembly for the generated code.

It's easy to install all of the required tools:

### Linux

Install LLVM 18 on Ubuntu/Debian:
Install LLVM 19 on Ubuntu/Debian:

```sh
wget https://apt.llvm.org/llvm.sh
chmod +x llvm.sh
sudo ./llvm.sh 18
sudo ./llvm.sh 19
```

Install LLVM 18 on Fedora Linux 40 or newer:
Install LLVM 19 on Fedora Linux 40 or newer:

```sh
sudo dnf install 'clang(major) = 18' 'llvm(major) = 18'
sudo dnf install 'clang(major) = 19' 'llvm(major) = 19'
```

### macOS

Install LLVM 18 with [Homebrew](https://brew.sh):
Install LLVM 19 with [Homebrew](https://brew.sh):

```sh
brew install llvm@18
brew install llvm@19
```

Homebrew won't add any of the tools to your `$PATH`. That's okay; the build script knows how to find them.

### Windows

Install LLVM 18 [by searching for it on LLVM's GitHub releases page](https://github.com/llvm/llvm-project/releases?q=18), clicking on "Assets", downloading the appropriate Windows installer for your platform (likely the file ending with `-win64.exe`), and running it. **When installing, be sure to select the option labeled "Add LLVM to the system PATH".**
Install LLVM 19 [by searching for it on LLVM's GitHub releases page](https://github.com/llvm/llvm-project/releases?q=19), clicking on "Assets", downloading the appropriate Windows installer for your platform (likely the file ending with `-win64.exe`), and running it. **When installing, be sure to select the option labeled "Add LLVM to the system PATH".**

Alternatively, you can use [chocolatey](https://chocolatey.org):

```sh
choco install llvm --version=18.1.6
choco install llvm --version=19.1.0
```

### Dev Containers

If you are working CPython in a [Codespaces instance](https://devguide.python.org/getting-started/setup-building/#using-codespaces), there's no need to install LLVM as the Fedora 40 base image includes LLVM 18 out of the box.

## Building

2 changes: 1 addition & 1 deletion Tools/jit/_llvm.py
@@ -8,7 +8,7 @@
import subprocess
import typing

_LLVM_VERSION = 18
_LLVM_VERSION = 19
_LLVM_VERSION_PATTERN = re.compile(rf"version\s+{_LLVM_VERSION}\.\d+\.\d+\S*\s+")

_P = typing.ParamSpec("_P")
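
To make the effect of the version bump concrete, here is a minimal sketch of how the pattern above could be applied to a tool's `--version` banner. The helper name `is_supported` and the sample banners are assumptions for illustration; they are not part of `Tools/jit/_llvm.py`.

```python
import re

_LLVM_VERSION = 19
_LLVM_VERSION_PATTERN = re.compile(rf"version\s+{_LLVM_VERSION}\.\d+\.\d+\S*\s+")


def is_supported(version_output: str) -> bool:
    """Return True if the banner reports the required LLVM major version.

    Hypothetical helper, shown only to exercise the pattern.
    """
    return _LLVM_VERSION_PATTERN.search(version_output) is not None


# Sample banners (made up for this sketch):
print(is_supported("clang version 19.1.0\nTarget: x86_64-unknown-linux-gnu\n"))
print(is_supported("clang version 18.1.6\nTarget: x86_64-unknown-linux-gnu\n"))
```

Because the major version is baked into the pattern, an LLVM 18 toolchain no longer matches after this change.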
14 changes: 13 additions & 1 deletion Tools/jit/_stencils.py
@@ -2,6 +2,7 @@

import dataclasses
import enum
import sys
import typing

import _schema
@@ -132,15 +133,26 @@ class Hole:
def __post_init__(self) -> None:
self.func = _PATCH_FUNCS[self.kind]

def fold(self, other: typing.Self) -> typing.Self | None:
def fold(self, other: typing.Self, body: bytes) -> typing.Self | None:
"""Combine two holes into a single hole, if possible."""
instruction_a = int.from_bytes(
body[self.offset : self.offset + 4], byteorder=sys.byteorder
)
instruction_b = int.from_bytes(
body[other.offset : other.offset + 4], byteorder=sys.byteorder
)
reg_a = instruction_a & 0b11111
reg_b1 = instruction_b & 0b11111
reg_b2 = (instruction_b >> 5) & 0b11111

if (
self.offset + 4 == other.offset
and self.value == other.value
and self.symbol == other.symbol
and self.addend == other.addend
and self.func == "patch_aarch64_21rx"
and other.func == "patch_aarch64_12x"
and reg_a == reg_b1 == reg_b2
):
# These can *only* be properly relaxed when they appear together and
# patch the same value:
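
The new register check in `fold` can be tried in isolation. The sketch below assumes the usual AArch64 layout, where bits 0-4 hold the destination register (`Rd`/`Rt`) and bits 5-9 hold the base register (`Rn`) of the load. The helper names and the example encodings are chosen for illustration and do not appear in the PR.

```python
import sys

REG_MASK = 0b11111


def registers_match(body: bytes, offset_a: int, offset_b: int) -> bool:
    """Check that two adjacent instructions all use the same register.

    Hypothetical helper that mirrors the masks and shifts added to Hole.fold.
    """
    instruction_a = int.from_bytes(body[offset_a : offset_a + 4], byteorder=sys.byteorder)
    instruction_b = int.from_bytes(body[offset_b : offset_b + 4], byteorder=sys.byteorder)
    reg_a = instruction_a & REG_MASK  # destination of the first instruction
    reg_b1 = instruction_b & REG_MASK  # destination of the second instruction
    reg_b2 = (instruction_b >> 5) & REG_MASK  # base register of the second one
    return reg_a == reg_b1 == reg_b2


def encode(*instructions: int) -> bytes:
    """Pack 32-bit instruction words using the host byte order (as the PR does)."""
    return b"".join(word.to_bytes(4, byteorder=sys.byteorder) for word in instructions)


ADRP_X8 = 0x90000008  # adrp x8, 0
LDR_X8_FROM_X8 = 0xF9400108  # ldr x8, [x8]
LDR_X9_FROM_X8 = 0xF9400109  # ldr x9, [x8]

print(registers_match(encode(ADRP_X8, LDR_X8_FROM_X8), 0, 4))  # same register throughout
print(registers_match(encode(ADRP_X8, LDR_X9_FROM_X8), 0, 4))  # load targets a different register
```

The second pair is the case the added check guards against: the two holes are adjacent and patch the same value, but folding them would be wrong because the load writes to a different register than the one `adrp` set up.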
67 changes: 15 additions & 52 deletions Tools/jit/_targets.py
@@ -26,7 +26,6 @@
PYTHON_EXECUTOR_CASES_C_H = CPYTHON / "Python" / "executor_cases.c.h"
TOOLS_JIT_TEMPLATE_C = TOOLS_JIT / "template.c"


_S = typing.TypeVar("_S", _schema.COFFSection, _schema.ELFSection, _schema.MachOSection)
_R = typing.TypeVar(
"_R", _schema.COFFRelocation, _schema.ELFRelocation, _schema.MachORelocation
@@ -39,7 +38,6 @@ class _Target(typing.Generic[_S, _R]):
_: dataclasses.KW_ONLY
alignment: int = 1
args: typing.Sequence[str] = ()
ghccc: bool = False
prefix: str = ""
stable: bool = False
debug: bool = False
@@ -88,11 +86,7 @@ async def _parse(self, path: pathlib.Path) -> _stencils.StencilGroup:
sections: list[dict[typing.Literal["Section"], _S]] = json.loads(output)
for wrapped_section in sections:
self._handle_section(wrapped_section["Section"], group)
# The trampoline's entry point is just named "_ENTRY", since on some
# platforms we later assume that any function starting with "_JIT_" uses
# the GHC calling convention:
entry_symbol = "_JIT_ENTRY" if "_JIT_ENTRY" in group.symbols else "_ENTRY"
assert group.symbols[entry_symbol] == (_stencils.HoleValue.CODE, 0)
assert group.symbols["_JIT_ENTRY"] == (_stencils.HoleValue.CODE, 0)
if group.data.body:
line = f"0: {str(bytes(group.data.body)).removeprefix('b')}"
group.data.disassembly.append(line)
@@ -112,9 +106,6 @@ def _handle_relocation(
async def _compile(
self, opname: str, c: pathlib.Path, tempdir: pathlib.Path
) -> _stencils.StencilGroup:
# "Compile" the trampoline to an empty stencil group if it's not needed:
if opname == "trampoline" and not self.ghccc:
return _stencils.StencilGroup()
o = tempdir / f"{opname}.o"
args = [
f"--target={self.triple}",
@@ -128,6 +119,7 @@ async def _compile(
f"-I{CPYTHON / 'Include' / 'internal'}",
f"-I{CPYTHON / 'Include' / 'internal' / 'mimalloc'}",
f"-I{CPYTHON / 'Python'}",
f"-I{CPYTHON / 'Tools' / 'jit'}",
"-O3",
"-c",
# This debug info isn't necessary, and bloats out the JIT'ed code.
@@ -143,44 +135,12 @@ async def _compile(
# Don't call stack-smashing canaries that we can't find or patch:
"-fno-stack-protector",
"-std=c11",
"-o",
f"{o}",
f"{c}",
*self.args,
]
if self.ghccc:
# This is a bit of an ugly workaround, but it makes the code much
# smaller and faster, so it's worth it. We want to use the GHC
# calling convention, but Clang doesn't support it. So, we *first*
# compile the code to LLVM IR, perform some text replacements on the
# IR to change the calling convention(!), and then compile *that*.
# Once we have access to Clang 19, we can get rid of this and use
# __attribute__((preserve_none)) directly in the C code instead:
ll = tempdir / f"{opname}.ll"
args_ll = args + [
# -fomit-frame-pointer is necessary because the GHC calling
# convention uses RBP to pass arguments:
"-S",
"-emit-llvm",
"-fomit-frame-pointer",
"-o",
f"{ll}",
f"{c}",
]
await _llvm.run("clang", args_ll, echo=self.verbose)
ir = ll.read_text()
# This handles declarations, definitions, and calls to named symbols
# starting with "_JIT_":
ir = re.sub(
r"(((noalias|nonnull|noundef) )*ptr @_JIT_\w+\()", r"ghccc \1", ir
)
# This handles calls to anonymous callees, since anything with
# "musttail" needs to use the same calling convention:
ir = ir.replace("musttail call", "musttail call ghccc")
# Sometimes *both* replacements happen at the same site, so fix it:
ir = ir.replace("ghccc ghccc", "ghccc")
ll.write_text(ir)
args_o = args + ["-Wno-unused-command-line-argument", "-o", f"{o}", f"{ll}"]
else:
args_o = args + ["-o", f"{o}", f"{c}"]
await _llvm.run("clang", args_o, echo=self.verbose)
await _llvm.run("clang", args, echo=self.verbose)
return await self._parse(o)

async def _build_stencils(self) -> dict[str, _stencils.StencilGroup]:
@@ -519,7 +479,6 @@ def _handle_relocation(

def get_target(host: str) -> _COFF | _ELF | _MachO:
"""Build a _Target for the given host "triple" and options."""
# ghccc currently crashes Clang when combined with musttail on aarch64. :(
target: _COFF | _ELF | _MachO
if re.fullmatch(r"aarch64-apple-darwin.*", host):
target = _MachO(host, alignment=8, prefix="_")
@@ -535,16 +494,20 @@ def get_target(host: str) -> _COFF | _ELF | _MachO:
]
target = _ELF(host, alignment=8, args=args)
elif re.fullmatch(r"i686-pc-windows-msvc", host):
args = ["-DPy_NO_ENABLE_SHARED"]
target = _COFF(host, args=args, ghccc=True, prefix="_")
args = [
"-DPy_NO_ENABLE_SHARED",
# __attribute__((preserve_none)) is not supported
"-Wno-ignored-attributes",
]
target = _COFF(host, args=args, prefix="_")
elif re.fullmatch(r"x86_64-apple-darwin.*", host):
target = _MachO(host, ghccc=True, prefix="_")
target = _MachO(host, prefix="_")
elif re.fullmatch(r"x86_64-pc-windows-msvc", host):
args = ["-fms-runtime-lib=dll"]
target = _COFF(host, args=args, ghccc=True)
target = _COFF(host, args=args)
elif re.fullmatch(r"x86_64-.*-linux-gnu", host):
args = ["-fpic"]
target = _ELF(host, args=args, ghccc=True)
target = _ELF(host, args=args)
else:
raise ValueError(host)
return target
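
As a reading aid, the dispatch in `get_target` can be summarized as a table from host triple pattern to object-file format. The sketch below covers only the branches visible in this hunk. The function name `object_format` and the string labels are assumptions standing in for the `_COFF`, `_ELF`, and `_MachO` classes.

```python
import re

# (pattern, format) pairs for the branches shown in this hunk only; the
# collapsed part of the diff contains further branches that are omitted here.
_VISIBLE_TARGETS = [
    (r"aarch64-apple-darwin.*", "Mach-O"),
    (r"i686-pc-windows-msvc", "COFF"),
    (r"x86_64-apple-darwin.*", "Mach-O"),
    (r"x86_64-pc-windows-msvc", "COFF"),
    (r"x86_64-.*-linux-gnu", "ELF"),
]


def object_format(host: str) -> str:
    """Return the object-file format for a host triple (hypothetical helper)."""
    for pattern, fmt in _VISIBLE_TARGETS:
        if re.fullmatch(pattern, host):
            return fmt
    # Same failure mode as get_target: unknown triples are rejected.
    raise ValueError(host)


print(object_format("x86_64-unknown-linux-gnu"))
print(object_format("aarch64-apple-darwin23.5.0"))
```

After this PR none of these branches passes `ghccc=True`; the calling convention is instead declared in the C sources, with `-Wno-ignored-attributes` added for `i686-pc-windows-msvc`, where `preserve_none` is not supported.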
2 changes: 1 addition & 1 deletion Tools/jit/_writer.py
@@ -65,7 +65,7 @@ def _dump_stencil(opname: str, group: _stencils.StencilGroup) -> typing.Iterator
if skip:
skip = False
continue
if pair and (folded := hole.fold(pair)):
if pair and (folded := hole.fold(pair, stencil.body)):
skip = True
hole = folded
yield f" {hole.as_c(part)}"
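
The `skip`/`pair` bookkeeping in this loop is a lookahead fold over adjacent items. Here is a generic sketch of that pattern, assuming the loop walks each hole together with its successor (the iteration itself is outside this hunk). The names `fold_adjacent` and `merge_consecutive` are hypothetical.

```python
import itertools
import typing

T = typing.TypeVar("T")


def fold_adjacent(
    items: typing.Sequence[T],
    fold: typing.Callable[[T, T], typing.Optional[T]],
) -> typing.Iterator[T]:
    """Yield items, merging each item with its successor when fold() allows it."""
    skip = False
    for item, pair in itertools.zip_longest(items, items[1:]):
        if skip:
            # This item was already merged into the previous one.
            skip = False
            continue
        if pair is not None and (folded := fold(item, pair)) is not None:
            skip = True
            item = folded
        yield item


def merge_consecutive(a: int, b: int) -> typing.Optional[int]:
    """Toy fold rule: merge two integers only when they are consecutive."""
    return a + b if b == a + 1 else None


print(list(fold_adjacent([1, 2, 5, 7, 8], merge_consecutive)))
```

In the PR the fold rule additionally needs the stencil body, which is why `hole.fold(pair)` becomes `hole.fold(pair, stencil.body)`.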
4 changes: 4 additions & 0 deletions Tools/jit/jit.h
@@ -0,0 +1,4 @@
// To use preserve_none in JIT builds, we need to declare a separate function
// pointer with __attribute__((preserve_none)), since this attribute may not be
// supported by the compiler used to build the rest of the interpreter.
typedef jit_func __attribute__((preserve_none)) jit_func_preserve_none;
8 changes: 5 additions & 3 deletions Tools/jit/template.c
@@ -21,6 +21,8 @@

#include "ceval_macros.h"

#include "jit.h"

#undef CURRENT_OPARG
#define CURRENT_OPARG() (_oparg)

@@ -49,7 +51,7 @@
do { \
OPT_STAT_INC(traces_executed); \
__attribute__((musttail)) \
return ((jit_func)((EXECUTOR)->jit_side_entry))(frame, stack_pointer, tstate); \
return ((jit_func_preserve_none)((EXECUTOR)->jit_side_entry))(frame, stack_pointer, tstate); \
} while (0)

#undef GOTO_TIER_ONE
@@ -72,7 +74,7 @@ do { \
do { \
PyAPI_DATA(void) ALIAS; \
__attribute__((musttail)) \
return ((jit_func)&ALIAS)(frame, stack_pointer, tstate); \
return ((jit_func_preserve_none)&ALIAS)(frame, stack_pointer, tstate); \
} while (0)

#undef JUMP_TO_JUMP_TARGET
@@ -86,7 +88,7 @@ do { \

#define TIER_TWO 2

_Py_CODEUNIT *
__attribute__((preserve_none)) _Py_CODEUNIT *
_JIT_ENTRY(_PyInterpreterFrame *frame, _PyStackRef *stack_pointer, PyThreadState *tstate)
{
// Locals that the instruction implementations expect to exist:
9 changes: 4 additions & 5 deletions Tools/jit/trampoline.c
@@ -4,11 +4,10 @@
#include "pycore_frame.h"
#include "pycore_jit.h"

// This is where the calling convention changes, on platforms that require it.
// The actual change is patched in while the JIT compiler is being built, in
// Tools/jit/_targets.py. On other platforms, this function compiles to nothing.
#include "jit.h"

_Py_CODEUNIT *
_ENTRY(_PyInterpreterFrame *frame, _PyStackRef *stack_pointer, PyThreadState *tstate)
_JIT_ENTRY(_PyInterpreterFrame *frame, _PyStackRef *stack_pointer, PyThreadState *tstate)
{
// This is subtle. The actual trace will return to us once it exits, so we
// need to make sure that we stay alive until then. If our trace side-exits
@@ -19,7 +18,7 @@ _ENTRY(_PyInterpreterFrame *frame, _PyStackRef *stack_pointer, PyThreadState *ts
Py_INCREF(executor);
// Note that this is *not* a tail call:
PyAPI_DATA(void) _JIT_CONTINUE;
_Py_CODEUNIT *target = ((jit_func)&_JIT_CONTINUE)(frame, stack_pointer, tstate);
_Py_CODEUNIT *target = ((jit_func_preserve_none)&_JIT_CONTINUE)(frame, stack_pointer, tstate);
Py_SETREF(tstate->previous_executor, executor);
return target;
}