-
-
Notifications
You must be signed in to change notification settings - Fork 32.1k
gh-119154: Simplify consumers by making queue.Queue
an iterable
#120503
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Can you find another reviewer? I’m just not in the right headspace right now. |
*.Queue.__iter__()
queue.Queue.__iter__()
cc @Zac-HD |
I'm not a CPython core dev, so I can't approve or merge your PRs 😅 We might want to document (and test) both the old-style and new-style approaches, but otherwise this looks good to me. |
Yes, I know. I just wanted your feedback. What to do with |
…3-37-45.gh-issue-120499.JK7Mv8.rst
@Zac-HD @nineteendo Picking up review for Guido. A few questions...
|
Quoting and expanding from #119154 for the motivation and benefit:
...but all that is about
I think the risks are pretty small, since it's a tightly-scoped change, although I'd aim to merge the |
queue.Queue.__iter__()
queue.Queue
an iterable
It has the same motivation as I agree with Zac that we should merge the |
Was waiting for Guido to opine on this but he is unavailable it looks like this falls to me. I going to decline this suggestion.
Thanks again for the suggestion but let's avoid feature creep when no significant new capability is being added. Minor respellings typically don't benefit users. In this case, I think it makes them worse-off being hiding two essential elements that should be explicit (the If Guido wants to reopen this, please go ahead. Otherwise, I think we're better off without it. |
That's an example for processing queues by daemon consumer threads. I left it in because Zac mentioned documentating both approaches:
If that example is actually rewritten using iteration, we don't need to use daemon threads. Without iterationimport threading
import queue
def worker(q):
while True:
item = q.get()
print(f'Working on {item}')
print(f'Finished {item}')
q.task_done()
q = queue.Queue()
threading.Thread(target=worker, args=(q,) daemon=True).start()
for item in range(30):
q.put(item)
q.join()
print('All work completed') # Worker is still running With iterationimport concurrent.futures
import queue
def worker(q):
for item in q:
print(f'Working on {item}')
print(f'Finished {item}')
q = queue.Queue()
with concurrent.futures.ThreadPoolExecutor() as tp:
tp.submit(worker, q)
for item in range(30):
q.put(item)
q.shutdown()
print('All work completed') # No worker is running
Processing a queue isn't always the last thing a program does, see cpython/Lib/test/test_queue.py Lines 155 to 181 in fd104df
This does get a lot simpler with this addition. |
@gvanrossum, did rhettinger change your mind? |
Also, rhettinger, only 24.5% are "simple" consumers, not the majority like you stated:
(I deleted the old message, as the search was too broad) Edit: removed "NOT" from the link preview. |
If Guido thinks this is a good API, I would be happy for him to reopen it. My personal judgment is that this makes users worse off but I may be wrong. I am clear that we don't have to do this. It is only combines a while-get with a try/except. There is no new capability being added. It is perfectly reasonable to wait to see how people use |
OK, and |
What would this do with an All the PR does is hide both a
Hiding the two steps makes users worse-off. Besides adding overhead, it hides a blocking call which in my expericience is something that needs to be explicit. Also, it distances the user from adding any other behavior in the except-clause, possibly making a log entry or finalizing a resource. And as noted above it, distances the user from adding timeouts and non-blocking behaviors which typically need special handling distinct from the case of a |
Maybe like this? This ensures the queue is not empty (without risking race conditions). That seems like the only sensible behaviour as queues aren't infinite (why would you otherwise want to iterate over them without waiting). def iter(self, block=True, timeout=None):
'''TODO'''
try:
yield self.get(block=block, timeout=timeout)
except ShutDown:
return
try:
while True:
yield self.get(block=block, timeout=timeout)
except (Empty, ShutDown):
return |
You can do that after the for loop: for ... in q.iter():
# do something
cleanup(q)
In that case, create a subclass.
It can be called when the user asks for the next item, but that wouldn't work if you put them in a list first. |
It's safer to call q.task_done() explicitly, see the discussion here: #119154. |
I am not the arbiter of APIs any more. I am currently on vacation and then dealing with medical things, so please find someone else. |
queue.Queue
an iterablequeue.Queue
an iterable
queue.Queue
an iterablequeue.Queue
an iterable
Purpose
Currently consuming items from a queue is very complex. You need to
q.task_done()
after processing each itemBy making
queue.Queue
an iterable, this becomes a lot easier, you only need to callq.shutdown()
.Overview of changes
queue.Queue.__iter__()
returns a generator which iterates over the queue of itemsExample
Without iteration
With iteration
Output
asyncio.Queue.__aiter__
#119154📚 Documentation preview 📚:
library/queue.html