Skip to content

GH-91048: Add utils for capturing async call stack for asyncio programs and enable profiling #124640

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 88 commits into from
Jan 22, 2025

Conversation

1st1
Copy link
Member

@1st1 1st1 commented Sep 26, 2024

This PR introduces low overhead utils to enable async stack reconstruction for running and suspended tasks and futures. The runtime overhead this PR adds is making tasks set & reset a reference to future/tasks they await on. Control flow primitives like asyncio.TaskGroup, asyncio.gather(), asyncio.shield() are also updated to perform this simple bookkeeping.

Remaining to-do for @pablogsal, @ambv, and myself

  • swap_current_task in asynciomodule.c needs to be updated to take care of the new ts->asyncio_running_task
  • Add a few tests for eager tasks and how they would interact with this (e.g. running only eager tasks, running eager task within a task, running a task from an eager task from a task, etc.)
  • test_running_loop_within_a_loop seg faults if the test is ran twice (I think), a repro is running ./python.exe -m test test_asyncio -R3:3
  • -R3:3 reports refleaks , might be nothing, but needs to be checked (OK, definitely looks like there's a leak "test_asyncio.test_unix_events leaked [492, 492, 492] references, sum=1476", repro ./python.exe -m test test_asyncio.test_subprocess test_asyncio.test_taskgroups -R3:3)
  • Run full perf test again, we might have other regressions besides the already fixed gather()
  • Cover both Python and C implementations (right now only C is being tested)

New APIs:

  • asyncio.capture_call_graph(*, future=None, depth=1) to capture the call stack for the current task or for the specified task or future
  • asyncio.print_call_graph(*, future=None, file=None, depth=1) to print the call stack for the current task or for the specified task or future
  • asyncio.future_add_to_awaited_by() and asyncio.future_discard_from_awaited_by() to enable "stitching" tasks and futures that are awaiting other tasks and futures.
  • frame.f_generator a getter to return the generator object associated with the frame or None. The implementation is maximally straightforward and does not require any new fields in any of the internal interpreter structs.

New C APIs:

  • Coroutine and generator structs gain a new pointer cr_task. It is a borrowed pointer to the task that runs the coroutine. This is meant for external profilers to quickly find the associated task (an alternative to this would be a rather costly traversal of interpreter state to find the module state of asyncio, requiring to read many Python dict structs).
  • A "private" C API function void _PyCoro_SetTask(PyObject *coro, PyObject *task) to set the cr_task field from _asynciomodule.c.

We have a complete functional test for out of process introspection in test_external_inspection.py (sampling profilers and debuggers will follow the same approach).

Example

This example program:

async def deep():
    await asyncio.sleep(0)
    import pprint
    pprint.pp(asyncio.capture_call_graph())

async def c1():
    await asyncio.sleep(0)
    await deep()

async def c2():
    await asyncio.sleep(0)

async def main():
    await asyncio.gather(c1(), c2())

asyncio.run(main())

will print:

FrameCallGraphEntry(
    future=<Task pending name='Task-2' ...>, 
    call_stack=[
        FrameCallGraphEntry(
            frame=..., 
        FrameCallGraphEntry(
            frame=...)
    ], 
    awaited_by=[
        FrameCallGraphEntry(
            future=<Task pending name='Task-1' ...>, 
            call_stack=[
                FrameCallGraphEntry(
                    frame=...)
            ], 
            awaited_by=[]
        )
    ]
)

📚 Documentation preview 📚: https://cpython-previews--124640.org.readthedocs.build/

@1st1
Copy link
Member Author

1st1 commented Sep 27, 2024

I've implemented proof of concept out of process async stack reconstruction here: 1st1@7185f8d

@pablogsal pablogsal force-pushed the stack branch 3 times, most recently from 760ad58 to 4f7bf44 Compare September 30, 2024 11:28
Copy link
Contributor

@mpage mpage left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for doing this! Do you know how much overhead the awaiter tracking adds?

@1st1
Copy link
Member Author

1st1 commented Oct 2, 2024

@mpage

Thanks for doing this! Do you know how much overhead the awaiter tracking adds?

I haven't measured, but I don't expect the overhead to be detectable at all. In the most common scenario the tracking is just assigning / resetting a pointer in Task/Future C structs.

@pablogsal
Copy link
Member

@mpage

Thanks for doing this! Do you know how much overhead the awaiter tracking adds?

I haven't measured, but I don't expect the overhead to be detectable at all. In the most common scenario the tracking is just assigning / resetting a pointer in Task/Future C structs.

I ran some async benchmarks from pyperformance and a small echo tcp server and the Perf impact is below the noise.

import dataclasses
import sys
import types
import typing
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't we remove typing imports recently from asyncio to speed-up import time? Because I think importing asyncio would now also import typing since we do from .graph import * in asyncio.__init__. I think we should let typeshed the responsibility of having the type hints.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just rebased this PR, so the current source may be outdated with other changes we did elsewhere

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First off, I don't think the crusade to remove typing imports from the standard library makes sense. Any non-trivial application will import typing anyway through some third-party dependency. It's a lot of effort for a very brittle outcome.

As for this particular import, we need it to type a file. I'll see what I can do to avoid the import.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well.. whether it makes sense or not is a legitimate question, but since something was recently approved to explicitly removed typing from asyncio imports, I thought it would have been better not to implicitly revert it (7363476).

@pablogsal
Copy link
Member

pablogsal commented Jan 22, 2025

@1st1 @ambv I fixed the crashes, refleaks and I did one of the most painful rebases ever, so please let get this merged as soon as possible to avoid collisions with the other work going on :)

@kumaraditya303
Copy link
Contributor

There have been some changes in asyncio which are relevant for this:

  • The C implementation is thread safe so new code should ideally be thread safe as well, thread safety is superficial here as tasks aren't thread safe really, it just shouldn't crash, mostly adding @critical_section to getter and methods would make it work.
  • Add object locked assertion to any internal method if necessary
  • There have been discussions about moving current task to per loop for free-threading, that might break the external introspection in future.

I haven't followed on this PR so not sure how many of those points matter here, but still I wrote it incase anyone else missed them.

@pablogsal
Copy link
Member

  • There have been discussions about moving current task to per loop for free-threading, that might break the external introspection in future.

Then we should find some solution that doesn't break the external introspection in the future ;)

@pablogsal
Copy link
Member

  • The C implementation is thread safe so new code should ideally be thread safe as well, thread safety is superficial here as tasks aren't thread safe really, it just shouldn't crash, mostly adding @critical_section to getter and methods would make it work.
  • Add object locked assertion to any internal method if necessary

This is slightly tricky and if you recall we removed the locks we added at your request. This PR is already unwieldy complicated so I would prefer if we work together in a separate PR to ensure the lock safety is addressed as you would like it

@kumaraditya303
Copy link
Contributor

This is slightly tricky and if you recall we removed the locks we added at your request. This PR is already unwieldy complicated so I would prefer if we work together in a separate PR to ensure the lock safety is addressed as you would like it

Yes, I do remember but it was long time ago 2-3 months, I have made significant changes related to free-threading since that time. It's fine by me to work on free-threading later and merge this first "as-is".

@ambv ambv merged commit 1885988 into python:main Jan 22, 2025
44 of 45 checks passed
@ambv
Copy link
Contributor

ambv commented Jan 22, 2025

Thank you, Kumar. I landed this feature. We've received a ton of review here, definitely more than most changes. We addressed all feedback on design, correctness, and performance. The long-lived branch is hard to keep alive for longer. Thank you to everyone involved, let's evolve this forward from the main branch.

We'll help with #128869 and future free-threading compatibility to make sure external introspection keeps working. As part of the free-threading compatibility of this particular feature, we'll address this in a subsequent PR soon, where we will also add support for eager task external introspection. The new tests added by Kumar in January show some interesting behavior of eager tasks, I'm looking into that. Will provide more information on the subsequent PR.

ambv added a commit to ambv/cpython that referenced this pull request Jan 22, 2025
…r tasks

This was missing from pythongh-124640. It's already covered by the new
test_asyncio/test_free_threading.py in combination with the runtime
assertion in set_ts_asyncio_running_task.
@picnixz
Copy link
Member

picnixz commented Jan 23, 2025

Not sure if this is directly related, but we're seeing build bot failures now: https://buildbot.python.org/#/builders/125/builds/6894

test_async_gather_remote_stack_trace (test.test_external_inspection.TestGetStackTrace.test_async_gather_remote_stack_trace) ... Fatal Python error: _Py_CheckFunctionResult: a function returned NULL without setting an exception

@pablogsal
Copy link
Member

Not sure if this is directly related, but we're seeing build bot failures now: https://buildbot.python.org/#/builders/125/builds/6894

test_async_gather_remote_stack_trace (test.test_external_inspection.TestGetStackTrace.test_async_gather_remote_stack_trace) ... Fatal Python error: _Py_CheckFunctionResult: a function returned NULL without setting an exception

On it

@vstinner
Copy link
Member

See also my comment #129189 (comment)

ambv added a commit that referenced this pull request Jan 23, 2025
#129197)

This was missing from gh-124640. It's already covered by the new
test_asyncio/test_free_threading.py in combination with the runtime
assertion in set_ts_asyncio_running_task.

Co-authored-by: Kumar Aditya <kumaraditya@python.org>
@encukou
Copy link
Member

encukou commented Jan 24, 2025

@pablogsal @ambv, this broke several tier-1 buildbots for more than a day so, by the book, it should be reverted. But, I also know there are issues with a buildbot update currently, and reverting & reapplying would be a lot churn in this case.

I'll leave it to you to handle this one.

FWIW, each working day I check there are no other buildbot failures, but I'll be offline during the weekend.

@vstinner
Copy link
Member

I wrote a PR to fix the test: #129262

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.