-
-
Notifications
You must be signed in to change notification settings - Fork 31.8k
bpo-34270: added the ability to name asyncio tasks #8547
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The ability to name individual tasks helps developers debug complex asyncio applications. Co-authored-by: Antti Haapala <antti.haapala@anttipatterns.com>
Hello, and thanks for your contribution! I'm a bot set up to make sure that the project can legally accept your contribution by verifying you have signed the PSF contributor agreement (CLA). Unfortunately we couldn't find an account corresponding to your GitHub username on bugs.python.org (b.p.o) to verify you have signed the CLA (this might be simply due to a missing "GitHub Name" entry in your b.p.o account settings). This is necessary for legal reasons before we can look at your contribution. Please follow the steps outlined in the CPython devguide to rectify this issue. When your account is ready, please add a comment in this pull request You can check yourself Thanks again for your contribution, we look forward to reviewing it! |
I've updated my account to include my github name, so the CLA label should be cleared. |
I'll review this tomorrow |
Co-author here (CLA signed) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait for Yury, but it looks good to me. One question about the need to convert names to str on assignment, but not a big deal.
Lib/asyncio/tasks.py
Outdated
|
||
@name.setter | ||
def name(self, value): | ||
self._name = str(value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is forcing this to a str
really important enough to need a property? The name is basically only ever used for printing anyway, which will do the conversion when necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is consistent with the behavior of threading.Thread
. Some code might also access the name
property directly, expecting a str
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I mentioned above I don't think we need a setter. A Task.get_name()
function is enough for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can set names of threads and processes too, why not tasks as well? Where's the harm in that? I can think of a few use cases for this, like displaying the status of a task as it progresses.
I could convert this to java-style setters and getters if you want, despite how much I despise them (not to mention the fact that threading
also switched away from this behavior). But your call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can think of a few use cases for this, like displaying the status of a task as it progresses.
Hm, interesting. OK, that's at least one valid use-case :)
I could convert this to java-style setters and getters if you want, despite how much I despise them (not to mention the fact that threading also switched away from this behavior). But your call.
It's not about liking Java-style getters or not, it's about consistency. If asyncio's APIs were using @property
it'd be equally happy about that, but now we better follow the overall style and use get_name()
and set_name()
.
Lib/asyncio/tasks.py
Outdated
@@ -23,6 +24,9 @@ | |||
from . import futures | |||
from .coroutines import coroutine | |||
|
|||
# Helper to generate new task names | |||
_counter = itertools.count(1).__next__ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using itertools.count()
seems like a bit of an overkill to me. I'd simply do an explicit increment. Also, _task_name_counter
would be a clearer name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed this seems like an overkill. There's no reason to make Task creation any slower than it has to be; keep in mind that PyPy uses the pure Python version. Just replace this with a simple int increment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is what the threading module does, and it uses itertools.count()
because _counter += 1
is not thread safe. The same applies to asyncio if there are more than one thread running an event loop (since the counter is global).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_counter += 1 is not thread safe.
It is thread safe in Python and will likely always be with or without the GIL.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is thread safe in Python and will likely always be with or without the GIL.
What output does this script give you then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm... thank you for pointing this out! ;)
To my point: +=
is actually atomic, but storing the updated number in globals is another operation and we have a race between them. This actually deserves a comment in the code explaining why we use counter()
and not +=
.
@@ -250,7 +250,7 @@ Futures | |||
Tasks | |||
----- | |||
|
|||
.. method:: AbstractEventLoop.create_task(coro) | |||
.. method:: AbstractEventLoop.create_task(coro, \*, name=None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We also need to update the documentation of set_task_factory
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point.
Doc/library/asyncio-task.rst
Outdated
@@ -504,6 +516,12 @@ Task | |||
get_stack(). The file argument is an I/O stream to which the output | |||
is written; by default output is written to sys.stderr. | |||
|
|||
.. attribute:: name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I care way more about consistency of asyncio APIs than about maintaining any kind of artificial similarity between Task and Thread. And in asyncio we don't use properties that much. Please replace the name
property with a Task.get_name()
method. There's no use case for adding set_name()
though, so let's not add it at least in 3.8.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it.
|
||
if task._fut_waiter is not None: | ||
info.insert(2, f'wait_for={task._fut_waiter!r}') | ||
info.insert(3, f'wait_for={task._fut_waiter!r}') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, feel free to make another pull request to refactor this code. I don't like us modifying some list object with insert()
calls.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How would you prefer that we display the name of the task in repr()
then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be great to refactor the code that formats things (tasks, callbacks, reprs) etc somehow. Right now it feels hacky and it's one of the areas of asyncio code that I'd like to be rewritten to be clearer. Discussing that is slightly out of the scope of this PR though, but if you have any ideas how to avoid mutating lists and having a non-linear formatting logic I'd be happy to accept a new PR. For this PR, your current code is totally fine!
Lib/asyncio/tasks.py
Outdated
@@ -23,6 +24,9 @@ | |||
from . import futures | |||
from .coroutines import coroutine | |||
|
|||
# Helper to generate new task names | |||
_counter = itertools.count(1).__next__ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed this seems like an overkill. There's no reason to make Task creation any slower than it has to be; keep in mind that PyPy uses the pure Python version. Just replace this with a simple int increment.
Lib/asyncio/tasks.py
Outdated
@@ -104,6 +108,7 @@ def __init__(self, coro, *, loop=None): | |||
self._log_destroy_pending = False | |||
raise TypeError(f"a coroutine was expected, got {coro!r}") | |||
|
|||
self._name = str(name) if name is not None else 'Task-%s' % _counter() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bit too hard to read. I'd replace it with an if
statement.
Lib/asyncio/tasks.py
Outdated
|
||
@name.setter | ||
def name(self, value): | ||
self._name = str(value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I mentioned above I don't think we need a setter. A Task.get_name()
function is enough for now.
Modules/_asynciomodule.c
Outdated
@@ -37,6 +37,8 @@ static PyObject *context_kwname; | |||
static PyObject *cached_running_holder; | |||
static volatile uint64_t cached_running_holder_tsid; | |||
|
|||
/* Counter for autogenerated Task names */ | |||
static unsigned long long task_name_counter = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use uint64_t
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had that at first but @ztane pointed out that the format string explicitly requires an unsigned long long
and uint64_t
may or may not be the same size. Are you sure this is ok?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it. AFAIK unsigned long long
is guaranteed to be large enough to store uint64_t
. Using the latter makes the code more consistent, so I'd go for it anyways.
You can use this format specifier to format uint64_t
specifically:
printf("... %" PRIu64 "...", (uint64_t)1);
Modules/_asynciomodule.c
Outdated
} else { | ||
name = PyObject_Str(name); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remove this new line to make it easier for the reader to see that we handle the case of name==NULL
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase |
BTW, please also add a What's New entry for 3.8. |
I like the idea! |
* Removed the public *name* property * Added the *Task.get_name()* method * Modified the documentation of *set_task_factory()* to include the *name* parameter * Whitespace changes
I have made the requested changes; please review again. |
I have made the requested changes; please review again. |
Lib/asyncio/tasks.py
Outdated
@@ -104,6 +120,11 @@ def __init__(self, coro, *, loop=None): | |||
self._log_destroy_pending = False | |||
raise TypeError(f"a coroutine was expected, got {coro!r}") | |||
|
|||
if name is None: | |||
self._name = 'Task-%s' % _task_name_counter() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I've recently updated asyncio to use f-strings. Please change this one to either an f-string or to a .format()
call (just for maintaining consistency of the codebase).
@@ -813,6 +813,22 @@ def create_task(self, coro): | |||
task._log_destroy_pending = False | |||
coro.close() | |||
|
|||
def test_create_named_task_with_custom_factory(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please also add test_create_named_task_with_default_factory
. That would test a bug we fixed in this PR when the default task factory didn't pass the name=name
to Task()
.
Almost there. Please fix the two remaining nit comments and I'll merge this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just spotted an error, a PyObject*
function returns -1
. Please also check that the C compiler generates 0 warnings for your patch in _asynciomodule.c
.
Modules/_asynciomodule.c
Outdated
{ | ||
PyObject *name = PyObject_Str(value); | ||
if (name == NULL) { | ||
return -1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oups, need to fix this to return NULL
!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With make clean; CFLAGS=-Wno-cast-function-type make
, compilation of the _asyncio
extension produced no warnings after the patch.
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase |
I have made the requested changes; please review again. |
Merged. Thanks for much for pushing this through, Alex! |
The ability to name individual tasks helps developers debug complex
asyncio applications.
Co-authored-by: Antti Haapala antti.haapala@anttipatterns.com
https://bugs.python.org/issue34270