gh-128002: use per threads tasks linked list in asyncio #128869

Merged: 22 commits, Feb 6, 2025

Conversation

kumaraditya303
Contributor

@kumaraditya303 kumaraditya303 commented Jan 15, 2025

Use a per-thread linked list of tasks in asyncio. This design allows lock-free registration and unregistration of tasks for loops running concurrently in different threads. When all_tasks is called, it uses a stop-the-world pause so that the calling thread can traverse the task lists of all threads. According to the benchmarks, this has no performance impact on regular builds and is slightly faster on free-threading builds. The pyperformance benchmarks are not a good measure of this change because they use just one thread, so there is little lock contention; the change performs much better when multiple threads are running.

On free-threading:

| Benchmark | bm-20250111-linux-x86_64-python-3a570c6d58bd5ad7d7c1-3.14.0a3+-3a570c6 | bm-20250112-linux-x86_64-kumaraditya303-per_thread_tasks-3.14.0a3+-cca4057 |
|---|---|---|
| async_tree_cpu_io_mixed_tg | 556 ms | 536 ms: 1.04x faster |
| async_tree_none_tg | 303 ms | 294 ms: 1.03x faster |
| coroutines | 26.2 ms | 25.5 ms: 1.03x faster |
| async_tree_memoization_tg | 397 ms | 387 ms: 1.03x faster |
| async_tree_cpu_io_mixed | 598 ms | 583 ms: 1.03x faster |
| async_generators | 498 ms | 486 ms: 1.02x faster |
| async_tree_io | 748 ms | 733 ms: 1.02x faster |
| async_tree_io_tg | 696 ms | 682 ms: 1.02x faster |
| async_tree_memoization | 442 ms | 436 ms: 1.02x faster |
| async_tree_none | 349 ms | 344 ms: 1.01x faster |
| Geometric mean | (ref) | 1.02x faster |

Benchmark hidden because not significant (1): asyncio_websockets
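
For readers who want to see the user-visible behaviour that the PR description refers to, here is a minimal sketch that runs two event loops in two threads and calls `asyncio.all_tasks()` inside each. It only uses the public API (`asyncio.all_tasks`, `asyncio.create_task`, `asyncio.current_task`, `asyncio.run`). The helper `run_loop`, the task names, and the `results` dict are made up for illustration, and nothing here exposes the C-level linked list itself.

```python
import asyncio
import threading


def run_loop(name, n_tasks, results):
    """Run a private event loop in the calling thread (illustrative helper)."""

    async def main():
        # Create tasks that are still pending when the snapshot is taken.
        tasks = [
            asyncio.create_task(asyncio.sleep(0.1), name=f"{name}-{i}")
            for i in range(n_tasks)
        ]
        current = asyncio.current_task()
        # all_tasks() with no argument reports tasks of the running loop only.
        snapshot = asyncio.all_tasks()
        results[name] = sorted(
            t.get_name() for t in snapshot if t is not current
        )
        await asyncio.gather(*tasks)

    asyncio.run(main())


results = {}
threads = [
    threading.Thread(target=run_loop, args=(f"loop{i}", 3, results))
    for i in range(2)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Each loop should report only the tasks it created, regardless of what the other thread registers at the same time.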

@1st1
Member

1st1 commented Jan 15, 2025

First, great work on this, this is legitimately a cool PR.

That said, I'm feeling really uneasy about _PyEval_StopTheWorld. Maybe instead of this approach we could try spin locks plus a custom mini hash map data structure? That would obviously be a lot more work, but it would be a cleaner and ultimately better-performing approach.

Also please wait for reviews from @pablogsal and @ambv. I'm curious if this would make external introspection harder.

@graingert
Contributor

Does this work where an event loop is used on one thread, stopped then resumed on another thread?
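
The scenario in this question can be written down as a short script, sketched here under the assumption that handing a stopped loop to another thread is acceptable outside asyncio's debug mode. A worker thread runs the loop long enough to create a task, the thread exits, and the main thread then inspects and resumes the same loop. The names `spawn`, `first_thread`, `created`, and the task name `migrating` are illustrative only.

```python
import asyncio
import threading

loop = asyncio.new_event_loop()
created = {}


async def spawn():
    # Create a task on the loop, then return so the loop stops
    # while the new task is still pending.
    created["task"] = loop.create_task(
        asyncio.sleep(0.05, result="done"), name="migrating"
    )


def first_thread():
    # First run of the loop happens in a worker thread.
    loop.run_until_complete(spawn())


worker = threading.Thread(target=first_thread)
worker.start()
worker.join()

# The loop is stopped and its original thread is gone.
# Ask for its unfinished tasks from the main thread.
pending_names = {task.get_name() for task in asyncio.all_tasks(loop)}

# Resume the same loop on the main thread and finish the task.
result = loop.run_until_complete(created["task"])
loop.close()
```

If per-thread bookkeeping handles this case, `pending_names` still contains the task created on the worker thread, and the task completes normally after the loop resumes.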

@kumaraditya303
Contributor Author

I have pushed some more changes:

  • Added an interpreter-level list of tasks, which is used when a task is still alive but its thread state has been deallocated, as suggested by @colesbury
  • Added thread id tracking in free-threading builds to avoid locking in the general case of loops running independently in threads without sharing tasks
  • Added some tests

TODO: benchmark it before merging
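
As a rough mental model of the bookkeeping described in the first bullet above, the following toy Python class keeps one list per thread plus a fallback list for tasks that outlive their thread. This is not CPython's implementation: the class and method names (`TaskRegistry`, `thread_exiting`, and so on) are hypothetical, plain strings stand in for tasks, and an ordinary lock stands in for the stop-the-world pause, which it does not faithfully reproduce.

```python
import threading


class TaskRegistry:
    """Toy model only; all names here are hypothetical, not CPython's."""

    def __init__(self):
        self._per_thread = {}   # thread id -> tasks registered by that thread
        self._interp_tasks = []  # tasks that outlived their thread
        # Stand-in for the stop-the-world pause used by the real code.
        self._pause = threading.Lock()

    def register(self, task):
        # Only the owning thread appends to its own list.
        self._per_thread.setdefault(threading.get_ident(), []).append(task)

    def unregister(self, task):
        self._per_thread[threading.get_ident()].remove(task)

    def thread_exiting(self):
        # Move still-alive tasks to the interpreter-level list.
        with self._pause:
            orphans = self._per_thread.pop(threading.get_ident(), [])
            self._interp_tasks.extend(orphans)

    def all_tasks(self):
        # Collect the interpreter-level list plus every thread's list.
        with self._pause:
            snapshot = list(self._interp_tasks)
            for tasks in list(self._per_thread.values()):
                snapshot.extend(tasks)
            return snapshot


reg = TaskRegistry()


def worker():
    reg.register("short-lived")
    reg.unregister("short-lived")
    reg.register("orphan")
    reg.thread_exiting()  # the thread goes away, "orphan" stays visible


t = threading.Thread(target=worker)
t.start()
t.join()

reg.register("main-task")
snapshot = sorted(reg.all_tasks())
```

The point of the model is only that a task registered by a thread that has since exited remains discoverable through the shared list.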

Member

@pablogsal pablogsal left a comment

If I am not mistaken, this solution seems incompatible with the asyncio introspection workflow we are adding in #124640. Please ensure that this change is compatible with the changes in that PR to avoid problems in the future.

@kumaraditya303
Contributor Author

> If I am not mistaken, this solution seems incompatible with the asyncio introspection workflow we are adding in #124640. Please ensure that this change is compatible with the changes in that PR to avoid problems in the future.

This PR has nothing to do with asyncio introspection. As I said in the other PR, the change that would affect it is making the current task per-loop, which this PR does not do.

@pablogsal
Member

Hmm, I must have misread how this affects the task management. Let me dismiss my request for changes in the meantime. Thanks for your patience with this, @kumaraditya303!

@kumaraditya303
Contributor Author

@colesbury PTAL

Contributor

@colesbury colesbury left a comment

This LGTM, but would you please also get this reviewed by another asyncio expert?

add_tasks_interp(PyInterpreterState *interp, PyListObject *tasks)
{
#ifdef Py_GIL_DISABLED
    assert(interp->stoptheworld.world_stopped);
Contributor

👍🏻

@ambv ambv merged commit 0d68b14 into python:main Feb 6, 2025
41 checks passed
srinivasreddy pushed a commit to srinivasreddy/cpython that referenced this pull request Feb 7, 2025
@kumaraditya303 kumaraditya303 deleted the per-thread-tasks branch February 7, 2025 11:59
cmaloney pushed a commit to cmaloney/cpython that referenced this pull request Feb 8, 2025