Skip to content

gh-132336: Mark a few "slow path" functions used by the interpreter loop as noinline #132337

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 10, 2025

Conversation

mpage
Copy link
Contributor

@mpage mpage commented Apr 9, 2025

These are all on the slow path and probably shouldn't be inlined into the interpreter loop.

Performance improves by ~3.6% on the free-threaded build.

The default build reports a ~0.9% regression, but I suspect that's mostly noise. We can restrict this change to only the free-threaded build (where it's clearly a win) if folks are concerned, but I think it's probably worth keeping the builds in sync.

These are all the slow path and should not be inlined into the interpreter
loop. Unfortunately, they end up being inlined with LTO and the current PGO
task.
@mpage mpage added the skip news label Apr 9, 2025
@mpage mpage requested review from colesbury and Yhg1s April 10, 2025 00:09
@mpage mpage marked this pull request as ready for review April 10, 2025 00:09
@mpage mpage requested a review from markshannon as a code owner April 10, 2025 00:09
@Yhg1s
Copy link
Member

Yhg1s commented Apr 10, 2025

I'm surprised the coverage benchmark isn't even significantly slower... That's the only benchmark I can think of that could be negatively affected by this, so since it isn't, let's do it.

@Yhg1s Yhg1s merged commit 619edb8 into python:main Apr 10, 2025
55 checks passed
@bedevere-bot
Copy link

⚠️⚠️⚠️ Buildbot failure ⚠️⚠️⚠️

Hi! The buildbot aarch64 Fedora Stable Refleaks 3.x (tier-2) has failed when building commit 619edb8.

What do you need to do:

  1. Don't panic.
  2. Check the buildbot page in the devguide if you don't know what the buildbots are or how they work.
  3. Go to the page of the buildbot that failed (https://buildbot.python.org/#/builders/123/builds/922) and take a look at the build logs.
  4. Check if the failure is related to this commit (619edb8) or if it is a false positive.
  5. If the failure is related to this commit, please, reflect that on the issue and make a new Pull Request with a fix.

You can take a look at the buildbot page here:

https://buildbot.python.org/#/builders/123/builds/922

Failed tests:

  • test_perf_profiler

Failed subtests:

  • test_python_calls_appear_in_the_stack_if_perf_activated - test.test_perf_profiler.TestPerfProfilerWithDwarf.test_python_calls_appear_in_the_stack_if_perf_activated

Summary of the results of the build (if available):

==

Click to see traceback logs
Traceback (most recent call last):
  File "/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/Lib/test/test_perf_profiler.py", line 364, in test_python_calls_appear_in_the_stack_if_perf_activated
    self.assertIn(f"py::foo:{script}", stdout)
    ~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: 'py::foo:/tmp/test_python_eyl0ngbc/tmpbqvoyxzs/perftest.py' not found in 'python 1692946 1434071.307581:          1 cycles:Pu: \n\t    ffff95522ac0 _start+0x0 (/usr/lib/ld-linux-aarch64.so.1)\n\npython 1692946 1434071.307608:          1 cycles:Pu: \n\tffffaa282d37fc78 [unknown] ([unknown])\n\tffffaa282d38049c [unknown] ([unknown])\n\tffffaa282bf215e4 [unknown] ([unknown])\n\t    ffff95522ac0 _start+0x0 (/usr/lib/ld-linux-aarch64.so.1)\n\npython 1692946 1434071.308057:          1 cycles:Pu: \n\t    ffff9550eca8 _dl_map_segment+0xfc8 (inlined)\n\t    ffff9550eca8 _dl_map_segments+0xfc8 (inlined)\n\t    ffff9550eca8 _dl_map_object_from_fd+0xfc8 (/usr/lib/ld-linux-aarch64.so.1)\n\t    ffff9550f133 _dl_map_object+0x1e7 (/usr/lib/ld-linux-aarch64.so.1)\n\t    ffff9550a5bf openaux+0x3f (/usr/lib/ld-linux-aarch64.so.1)\n\t    ffff95509303 _dl_catch_exception+0x63 (/usr/lib/ld-linux-aarch64.so.1)\n\t    ffff9550ab33 _dl_map_object_deps+0x553 (/usr/lib/ld-linux-aarch64.so.1)\n\t    ffff9552019f dl_main+0x139f (/usr/lib/ld-linux-aarch64.so.1)\n\t    ffff9551d5ff _dl_sysdep_start+0x1df (/usr/lib/ld-linux-aarch64.so.1)\n\t    ffff9551eb17 _dl_start_final+0x5ab (inlined)\n\t    ffff9551eb17 _dl_start+0x5ab (/usr/lib/ld-linux-aarch64.so.1)\n\t    ffff95522ad3 _start+0x13 (/usr/lib/ld-linux-aarch64.so.1)\n\npython 1692946 1434071.308078:        356 cycles:Pu: \n\tffffaa282d37ff10 [unknown] ([unknown])\n\tffffaa282d3804cc [unknown] ([unknown])\n\tffffaa282bf215e4 [unknown] ([unknown])\n\t    ffff955246cc __munmap+0xc (inlined)\n\t    ffff9550ecbf _dl_map_segment+0xfdf (inlined)\n\t    ffff9550ecbf _dl_map_segments+0xfdf (inlined)\n\t    ffff9550ecbf _dl_map_object_from_fd+0xfdf (/usr/lib/ld-linux-aarch64.so.1)\n\t    ffff9550f133 _dl_map_object+0x1e7 (/usr/lib/ld-linux-aarch64.so.1)\n\t    ffff9550a5bf openaux+0x3f (/usr/lib/ld-linux-aarch64.so.1)\n\t    ffff95509303 _dl_catch_exception+0x63 (/usr/li
x3cf (/usr/lib64/libc.so.6)\n\t          67261f _Py_SetLocaleFromEnv+0x17 (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          669837 _PyPreConfig_Read+0x15f (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          6728a7 _Py_PreInitializeFromPyArgv+0x103 (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          6a4c07 pymain_init+0x57 (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          6a4d7f pymain_main+0xf (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          6a4e13 Py_BytesMain+0x27 (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          41f337 main+0xb (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t    ffff952b625b __libc_start_call_main+0x7b (/usr/lib64/libc.so.6)\n\t    ffff952b633b __libc_start_main@@GLIBC_2.34+0x9b (/usr/lib64/libc.so.6)\n\t          41f22f _start+0x2f (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\npython 1692946 1434071.310466:     144068 cycles:Pu: \n\t          6438d4 hashtable_rehash+0x8c (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          643b87 _Py_hashtable_set+0xdb (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          55af53 intern_static+0x83 (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          574e9f _PyUnicode_InternStatic+0x27 (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          57569f _PyUnicode_InitStaticStrings+0x7a3 (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          58d8f7 init_global_interned_strings+0x77 (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          58db2f _PyUnicode_InitGlobalObjects+0x5b (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          66a983 pycore_init_global_objects+0x23 (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          66afaf pycore_interp_init+0x2b (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          66b28b pyinit_config+0xaf (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          672b77 pyinit_core+0xdb (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          672c4f Py_InitializeFromConfig+0x97 (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          6a4caf pymain_init+0xff (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          6a4d7f pymain_main+0xf (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          6a4e13 Py_BytesMain+0x27 (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          41f337 main+0xb (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t    ffff952b625b __libc_start_call_main+0x7b (/usr/lib64/libc.so.6)\n\t    ffff952b633b __libc_start_main@@GLIBC_2.34+0x9b (/usr/lib64/libc.so.6)\n\t          41f22f _start+0x2f (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\npython 1692946 1434071.311611:    1472487 cycles:Pu: \n\t          50877c _PyMem_DebugCheckAddress+0x48 (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          50886b _PyMem_DebugRawFree+0x37 (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          5088e7 _PyMem_DebugFree+0x2f (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          51ee8f PyObject_Free+0x1f (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          55552b unicode


Traceback (most recent call last):
  File "/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/Lib/test/test_perf_profiler.py", line 364, in test_python_calls_appear_in_the_stack_if_perf_activated
    self.assertIn(f"py::foo:{script}", stdout)
    ~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: 'py::foo:/tmp/test_python_a4p9qj8m/tmpwxs1oea0/perftest.py' not found in 'python 1740294 1434584.506657:          1 cycles:Pu: \n\t    ffff9f3eeac0 _start+0x0 (/usr/lib/ld-linux-aarch64.so.1)\n\npython 1740294 1434584.506679:          1 cycles:Pu: \n\tffffaa282d37fc78 [unknown] ([unknown])\n\tffffaa282d38049c [unknown] ([unknown])\n\tffffaa282bf215e4 [unknown] ([unknown])\n\t    ffff9f3eeac0 _start+0x0 (/usr/lib/ld-linux-aarch64.so.1)\n\npython 1740294 1434584.507066:          1 cycles:Pu: \n\t    ffff9f3e00e8 _dl_relocate_object+0x188 (/usr/lib/ld-linux-aarch64.so.1)\n\t    ffff9f3ec3d3 dl_main+0x15d3 (/usr/lib/ld-linux-aarch64.so.1)\n\t    ffff9f3e95ff _dl_sysdep_start+0x1df (/usr/lib/ld-linux-aarch64.so.1)\n\t    ffff9f3eab17 _dl_start_final+0x5ab (inlined)\n\t    ffff9f3eab17 _dl_start+0x5ab (/usr/lib/ld-linux-aarch64.so.1)\n\t    ffff9f3eead3 _start+0x13 (/usr/lib/ld-linux-aarch64.so.1)\n\npython 1740294 1434584.507085:        312 cycles:Pu: \n\t    ffff9f3e0324 _dl_relocate_object+0x3c4 (/usr/lib/ld-linux-aarch64.so.1)\n\t    ffff9f3ec3d3 dl_main+0x15d3 (/usr/lib/ld-linux-aarch64.so.1)\n\t    ffff9f3e95ff _dl_sysdep_start+0x1df (/usr/lib/ld-linux-aarch64.so.1)\n\t    ffff9f3eab17 _dl_start_final+0x5ab (inlined)\n\t    ffff9f3eab17 _dl_start+0x5ab (/usr/lib/ld-linux-aarch64.so.1)\n\t    ffff9f3eead3 _start+0x13 (/usr/lib/ld-linux-aarch64.so.1)\n\npython 1740294 1434584.508073:       1292 cycles:Pu: \n\t    ffff9f209240 __memcpy_generic+0x100 (/usr/lib64/libc.so.6)\n\t          51e98f _PyMem_RawWcsdup+0x4b (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          64fb73 _PyWideStringList_CopyEx+0xd7 (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          65067b _PyWideStringList_Copy+0xf (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          652727 _PyConfig_Copy+0x57 (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          66a70b pycore_create_interpre
table-aarch64.refleak/build/python)\n\t          66b28b pyinit_config+0xaf (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          672b77 pyinit_core+0xdb (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          672c4f Py_InitializeFromConfig+0x97 (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          6a4caf pymain_init+0xff (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          6a4d7f pymain_main+0xf (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          6a4e13 Py_BytesMain+0x27 (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          41f337 main+0xb (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t    ffff9f18625b __libc_start_call_main+0x7b (/usr/lib64/libc.so.6)\n\t    ffff9f18633b __libc_start_main@@GLIBC_2.34+0x9b (/usr/lib64/libc.so.6)\n\t          41f22f _start+0x2f (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\npython 1740294 1434584.509153:     161874 cycles:Pu: \n\t          5070a8 read_size_t+0x1c (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          508873 _PyMem_DebugRawFree+0x3f (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          5088e7 _PyMem_DebugFree+0x2f (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          51ee8f PyObject_Free+0x1f (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          55552b unicode_dealloc+0x123 (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          5034a7 _Py_Dealloc+0x87 (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          54eda7 Py_DECREF+0x63 (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          55b20b intern_common+0x153 (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          58e003 _PyUnicode_InternMortal+0x1b (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          58e0bf PyUnicode_InternFromString+0x1f (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          4b2517 descr_new+0x43 (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          4b36e7 PyDescr_NewWrapper+0x2b (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          5366b3 add_operators+0x5f (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          536a0f type_ready_fill_dict+0x13 (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          543d33 type_ready+0x93 (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          543fbf init_static_type+0x1a7 (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          5440c3 _PyStaticType_InitBuiltin+0x2f (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          502e93 _PyTypes_InitTypes+0xa7 (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          66a9eb pycore_init_types+0x1b (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          66b08f pycore_interp_init+0x10b (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          66b28b pyinit_config+0xaf (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          672b77 pyinit_core+0xdb (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          672c4f Py_InitializeFromConfig+0x97 (/home/buildbot/buildarea/3.x.cstratak-fedora-stable-aarch64.refleak/build/python)\n\t          6

@Yhg1s
Copy link
Member

Yhg1s commented Apr 10, 2025

Buildbot failure is not related to this change, it's been sporadically failing that way for at least a month now.

@markshannon
Copy link
Member

See my comments on the issue about what we should inline or not.
Has this been tested on the tail-calling build?

It should be easy enough to #define a NO_INLINE_FT macro to use here, to avoid the slowdown on the default build.
What might be worth trying is to mark all the specializing functions as Py_NO_INLINE and leave the rest. The number of calls to
specializing functions will be over-represented in the tests. I'm not so sure about the others.

@Yhg1s
Please don't merge stuff that slows down the default build without some sort of review or agreement from outside the free-threading team.

@mpage
Why so casually dismiss a 0.9% slowdown as noise? You seem willing enough to treat a 0.9% speedup as real in other cases.
This seems more like wishful thinking rather than a fair appraisal of the data.

@Yhg1s
Copy link
Member

Yhg1s commented Apr 11, 2025

@Yhg1s Please don't merge stuff that slows down the default build without some sort of review or agreement from outside the free-threading team.

I will be very happy to adopt this as a general CPython policy, yes.

@mpage Why so casually dismiss a 0.9% slowdown as noise? You seem willing enough to treat a 0.9% speedup as real in other cases. This seems more like wishful thinking rather than a fair appraisal of the data.

A 0.9% change is very much in the noise for pyperformance as run by bench_runner on https://github.com/facebookexperimental/free-threading-benchmarking. It's in the noise for unrelated changes that affect the compiler output, and it's even in the noise for A-A tests. We don't accept 1% speedups as real without a lot more vetting, as well as reasoning why it's real, and under what circumstances. Rolling back the change isn't a problem if it turns out the slowdown is real.

@mpage
Copy link
Contributor Author

mpage commented Apr 11, 2025

@markshannon

Has this been tested on the tail-calling build?

Only for the free-threaded build, where it was 0.3% slower (what I would call neutral).

Why so casually dismiss a 0.9% slowdown as noise? You seem willing enough to treat a 0.9% speedup as real in other cases. This seems more like wishful thinking rather than a fair appraisal of the data.

I don't think that's a fair characterization. I went back through the PRs that I've submitted or have been mentioned on and have consistently treated anything less than 1% as neutral. I've always reported the numbers and said which way I thought things were leaning. In this case it seems like things might be trending slightly negatively, but the most negatively affected benchmarks are also pretty noisy (at least based on my experience), which is why I said that I thought it was worth keeping the builds in sync.

I try hard to be a good steward of perf for the default build: I always run benchmark runs for both builds for any changes that might affect perf and try to leave PRs open for long enough to allow someone from the faster-cpython team to have a look at them before they are merged (at times waiting months).

I also specifically wanted to give you a chance to comment, which is why I said in the PR summary:

We can restrict this change to only the free-threaded build (where it's clearly a win) if folks are concerned

Sorry you didn't have a chance to weigh in before this was merged. I'm happy to put up a PR with something like NO_INLINE_FT since you are concerned.

seehwan pushed a commit to seehwan/cpython that referenced this pull request Apr 16, 2025
…eter loop as noinline (python#132337)

Mark a few functions used by the interpreter loop as noinline

These are all the slow path and should not be inlined into the interpreter
loop. Unfortunately, they end up being inlined with LTO and the current PGO
task.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants