-
-
Notifications
You must be signed in to change notification settings - Fork 31.8k
GH-91048: Add utils for capturing async call stack for asyncio programs and enable profiling #124640
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
I've implemented proof of concept out of process async stack reconstruction here: 1st1@7185f8d |
760ad58
to
4f7bf44
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for doing this! Do you know how much overhead the awaiter tracking adds?
I haven't measured, but I don't expect the overhead to be detectable at all. In the most common scenario the tracking is just assigning / resetting a pointer in Task/Future C structs. |
I ran some async benchmarks from pyperformance and a small echo tcp server and the Perf impact is below the noise. |
`f_generator` returns the generator / coroutine / async generator object that owns the frame. For all other kinds of frames it will return `None`. This is useful to reconstruct call stack for async/await code.
Lib/asyncio/graph.py
Outdated
import dataclasses | ||
import sys | ||
import types | ||
import typing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Didn't we remove typing imports recently from asyncio
to speed-up import time? Because I think importing asyncio
would now also import typing
since we do from .graph import *
in asyncio.__init__
. I think we should let typeshed the responsibility of having the type hints.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just rebased this PR, so the current source may be outdated with other changes we did elsewhere
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First off, I don't think the crusade to remove typing imports from the standard library makes sense. Any non-trivial application will import typing anyway through some third-party dependency. It's a lot of effort for a very brittle outcome.
As for this particular import, we need it to type a file. I'll see what I can do to avoid the import.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well.. whether it makes sense or not is a legitimate question, but since something was recently approved to explicitly removed typing
from asyncio imports, I thought it would have been better not to implicitly revert it (7363476).
There have been some changes in asyncio which are relevant for this:
I haven't followed on this PR so not sure how many of those points matter here, but still I wrote it incase anyone else missed them. |
Then we should find some solution that doesn't break the external introspection in the future ;) |
This is slightly tricky and if you recall we removed the locks we added at your request. This PR is already unwieldy complicated so I would prefer if we work together in a separate PR to ensure the lock safety is addressed as you would like it |
Yes, I do remember but it was long time ago 2-3 months, I have made significant changes related to free-threading since that time. It's fine by me to work on free-threading later and merge this first "as-is". |
Thank you, Kumar. I landed this feature. We've received a ton of review here, definitely more than most changes. We addressed all feedback on design, correctness, and performance. The long-lived branch is hard to keep alive for longer. Thank you to everyone involved, let's evolve this forward from the main branch. We'll help with #128869 and future free-threading compatibility to make sure external introspection keeps working. As part of the free-threading compatibility of this particular feature, we'll address this in a subsequent PR soon, where we will also add support for eager task external introspection. The new tests added by Kumar in January show some interesting behavior of eager tasks, I'm looking into that. Will provide more information on the subsequent PR. |
…r tasks This was missing from pythongh-124640. It's already covered by the new test_asyncio/test_free_threading.py in combination with the runtime assertion in set_ts_asyncio_running_task.
Not sure if this is directly related, but we're seeing build bot failures now: https://buildbot.python.org/#/builders/125/builds/6894
|
On it |
See also my comment #129189 (comment) |
@pablogsal @ambv, this broke several tier-1 buildbots for more than a day so, by the book, it should be reverted. But, I also know there are issues with a buildbot update currently, and reverting & reapplying would be a lot churn in this case. I'll leave it to you to handle this one. FWIW, each working day I check there are no other buildbot failures, but I'll be offline during the weekend. |
I wrote a PR to fix the test: #129262 |
This PR introduces low overhead utils to enable async stack reconstruction for running and suspended tasks and futures. The runtime overhead this PR adds is making tasks set & reset a reference to future/tasks they await on. Control flow primitives like
asyncio.TaskGroup
,asyncio.gather()
,asyncio.shield()
are also updated to perform this simple bookkeeping.Remaining to-do for @pablogsal, @ambv, and myself
swap_current_task
inasynciomodule.c
needs to be updated to take care of the newts->asyncio_running_task
test_running_loop_within_a_loop
seg faults if the test is ran twice (I think), a repro is running./python.exe -m test test_asyncio -R3:3
-R3:3
reports refleaks, might be nothing, but needs to be checked(OK, definitely looks like there's a leak "test_asyncio.test_unix_events leaked [492, 492, 492] references, sum=1476", repro./python.exe -m test test_asyncio.test_subprocess test_asyncio.test_taskgroups -R3:3
)gather()
New APIs:
asyncio.capture_call_graph(*, future=None, depth=1)
to capture the call stack for the current task or for the specified task or futureasyncio.print_call_graph(*, future=None, file=None, depth=1)
to print the call stack for the current task or for the specified task or futureasyncio.future_add_to_awaited_by()
andasyncio.future_discard_from_awaited_by()
to enable "stitching" tasks and futures that are awaiting other tasks and futures.frame.f_generator
a getter to return the generator object associated with the frame orNone
. The implementation is maximally straightforward and does not require any new fields in any of the internal interpreter structs.New C APIs:
Coroutine and generator structs gain a new pointercr_task
. It is a borrowed pointer to the task that runs the coroutine. This is meant for external profilers to quickly find the associated task (an alternative to this would be a rather costly traversal of interpreter state to find the module state of asyncio, requiring to read many Pythondict
structs).A "private" C API functionvoid _PyCoro_SetTask(PyObject *coro, PyObject *task)
to set thecr_task
field from_asynciomodule.c
.We have a complete functional test for out of process introspection in
test_external_inspection.py
(sampling profilers and debuggers will follow the same approach).Example
This example program:
will print:
📚 Documentation preview 📚: https://cpython-previews--124640.org.readthedocs.build/