Skip to content

bpo-34270: added the ability to name asyncio tasks #8547

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 22 commits into from
Aug 8, 2018

Conversation

agronholm
Copy link
Contributor

@agronholm agronholm commented Jul 29, 2018

The ability to name individual tasks helps developers debug complex
asyncio applications.

Co-authored-by: Antti Haapala antti.haapala@anttipatterns.com

https://bugs.python.org/issue34270

The ability to name individual tasks helps developers debug complex
asyncio applications.

Co-authored-by: Antti Haapala <antti.haapala@anttipatterns.com>
@agronholm agronholm requested review from 1st1 and asvetlov as code owners July 29, 2018 11:41
@the-knights-who-say-ni
Copy link

Hello, and thanks for your contribution!

I'm a bot set up to make sure that the project can legally accept your contribution by verifying you have signed the PSF contributor agreement (CLA).

Unfortunately we couldn't find an account corresponding to your GitHub username on bugs.python.org (b.p.o) to verify you have signed the CLA (this might be simply due to a missing "GitHub Name" entry in your b.p.o account settings). This is necessary for legal reasons before we can look at your contribution. Please follow the steps outlined in the CPython devguide to rectify this issue.

When your account is ready, please add a comment in this pull request
and a Python core developer will remove the CLA not signed label
to make the bot check again.

You can check yourself
to see if the CLA has been received.

Thanks again for your contribution, we look forward to reviewing it!

@agronholm
Copy link
Contributor Author

I've updated my account to include my github name, so the CLA label should be cleared.

@1st1
Copy link
Member

1st1 commented Jul 29, 2018

I'll review this tomorrow

@ztane
Copy link
Contributor

ztane commented Jul 29, 2018

Co-author here (CLA signed)

Copy link
Member

@zooba zooba left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait for Yury, but it looks good to me. One question about the need to convert names to str on assignment, but not a big deal.


@name.setter
def name(self, value):
self._name = str(value)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is forcing this to a str really important enough to need a property? The name is basically only ever used for printing anyway, which will do the conversion when necessary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is consistent with the behavior of threading.Thread. Some code might also access the name property directly, expecting a str.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I mentioned above I don't think we need a setter. A Task.get_name() function is enough for now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can set names of threads and processes too, why not tasks as well? Where's the harm in that? I can think of a few use cases for this, like displaying the status of a task as it progresses.
I could convert this to java-style setters and getters if you want, despite how much I despise them (not to mention the fact that threading also switched away from this behavior). But your call.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can think of a few use cases for this, like displaying the status of a task as it progresses.

Hm, interesting. OK, that's at least one valid use-case :)

I could convert this to java-style setters and getters if you want, despite how much I despise them (not to mention the fact that threading also switched away from this behavior). But your call.

It's not about liking Java-style getters or not, it's about consistency. If asyncio's APIs were using @property it'd be equally happy about that, but now we better follow the overall style and use get_name() and set_name().

@@ -23,6 +24,9 @@
from . import futures
from .coroutines import coroutine

# Helper to generate new task names
_counter = itertools.count(1).__next__
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using itertools.count() seems like a bit of an overkill to me. I'd simply do an explicit increment. Also, _task_name_counter would be a clearer name.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed this seems like an overkill. There's no reason to make Task creation any slower than it has to be; keep in mind that PyPy uses the pure Python version. Just replace this with a simple int increment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is what the threading module does, and it uses itertools.count() because _counter += 1 is not thread safe. The same applies to asyncio if there are more than one thread running an event loop (since the counter is global).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_counter += 1 is not thread safe.

It is thread safe in Python and will likely always be with or without the GIL.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is thread safe in Python and will likely always be with or without the GIL.

What output does this script give you then?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm... thank you for pointing this out! ;)

To my point: += is actually atomic, but storing the updated number in globals is another operation and we have a race between them. This actually deserves a comment in the code explaining why we use counter() and not +=.

@@ -250,7 +250,7 @@ Futures
Tasks
-----

.. method:: AbstractEventLoop.create_task(coro)
.. method:: AbstractEventLoop.create_task(coro, \*, name=None)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also need to update the documentation of set_task_factory

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point.

@@ -504,6 +516,12 @@ Task
get_stack(). The file argument is an I/O stream to which the output
is written; by default output is written to sys.stderr.

.. attribute:: name
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I care way more about consistency of asyncio APIs than about maintaining any kind of artificial similarity between Task and Thread. And in asyncio we don't use properties that much. Please replace the name property with a Task.get_name() method. There's no use case for adding set_name() though, so let's not add it at least in 3.8.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it.


if task._fut_waiter is not None:
info.insert(2, f'wait_for={task._fut_waiter!r}')
info.insert(3, f'wait_for={task._fut_waiter!r}')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, feel free to make another pull request to refactor this code. I don't like us modifying some list object with insert() calls.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How would you prefer that we display the name of the task in repr() then?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be great to refactor the code that formats things (tasks, callbacks, reprs) etc somehow. Right now it feels hacky and it's one of the areas of asyncio code that I'd like to be rewritten to be clearer. Discussing that is slightly out of the scope of this PR though, but if you have any ideas how to avoid mutating lists and having a non-linear formatting logic I'd be happy to accept a new PR. For this PR, your current code is totally fine!

@@ -23,6 +24,9 @@
from . import futures
from .coroutines import coroutine

# Helper to generate new task names
_counter = itertools.count(1).__next__
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed this seems like an overkill. There's no reason to make Task creation any slower than it has to be; keep in mind that PyPy uses the pure Python version. Just replace this with a simple int increment.

@@ -104,6 +108,7 @@ def __init__(self, coro, *, loop=None):
self._log_destroy_pending = False
raise TypeError(f"a coroutine was expected, got {coro!r}")

self._name = str(name) if name is not None else 'Task-%s' % _counter()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit too hard to read. I'd replace it with an if statement.


@name.setter
def name(self, value):
self._name = str(value)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I mentioned above I don't think we need a setter. A Task.get_name() function is enough for now.

@@ -37,6 +37,8 @@ static PyObject *context_kwname;
static PyObject *cached_running_holder;
static volatile uint64_t cached_running_holder_tsid;

/* Counter for autogenerated Task names */
static unsigned long long task_name_counter = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use uint64_t.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had that at first but @ztane pointed out that the format string explicitly requires an unsigned long long and uint64_t may or may not be the same size. Are you sure this is ok?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. AFAIK unsigned long long is guaranteed to be large enough to store uint64_t. Using the latter makes the code more consistent, so I'd go for it anyways.

You can use this format specifier to format uint64_t specifically:

printf("... %" PRIu64 "...", (uint64_t)1);

} else {
name = PyObject_Str(name);
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove this new line to make it easier for the reader to see that we handle the case of name==NULL

@bedevere-bot
Copy link

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@1st1
Copy link
Member

1st1 commented Jul 30, 2018

BTW, please also add a What's New entry for 3.8.

@asvetlov
Copy link
Contributor

I like the idea!

* Removed the public *name* property
* Added the *Task.get_name()* method
* Modified the documentation of *set_task_factory()* to include the
  *name* parameter
* Whitespace changes
@agronholm
Copy link
Contributor Author

I have made the requested changes; please review again.

@bedevere-bot
Copy link

Thanks for making the requested changes!

@zooba, @1st1: please review the changes made to this pull request.

@agronholm
Copy link
Contributor Author

I have made the requested changes; please review again.

@bedevere-bot
Copy link

Thanks for making the requested changes!

@zooba, @1st1: please review the changes made to this pull request.

@@ -104,6 +120,11 @@ def __init__(self, coro, *, loop=None):
self._log_destroy_pending = False
raise TypeError(f"a coroutine was expected, got {coro!r}")

if name is None:
self._name = 'Task-%s' % _task_name_counter()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I've recently updated asyncio to use f-strings. Please change this one to either an f-string or to a .format() call (just for maintaining consistency of the codebase).

@@ -813,6 +813,22 @@ def create_task(self, coro):
task._log_destroy_pending = False
coro.close()

def test_create_named_task_with_custom_factory(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please also add test_create_named_task_with_default_factory. That would test a bug we fixed in this PR when the default task factory didn't pass the name=name to Task().

@1st1
Copy link
Member

1st1 commented Aug 8, 2018

Almost there. Please fix the two remaining nit comments and I'll merge this.

Copy link
Member

@1st1 1st1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just spotted an error, a PyObject* function returns -1. Please also check that the C compiler generates 0 warnings for your patch in _asynciomodule.c.

{
PyObject *name = PyObject_Str(value);
if (name == NULL) {
return -1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oups, need to fix this to return NULL!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With make clean; CFLAGS=-Wno-cast-function-type make, compilation of the _asyncio extension produced no warnings after the patch.

@bedevere-bot
Copy link

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@agronholm
Copy link
Contributor Author

I have made the requested changes; please review again.

@bedevere-bot
Copy link

Thanks for making the requested changes!

@zooba, @1st1: please review the changes made to this pull request.

@1st1
Copy link
Member

1st1 commented Aug 8, 2018

Merged. Thanks for much for pushing this through, Alex!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants