Make threading._register_atexit public? #86128
I'm dealing with a subtle deadlock involving concurrent.futures.ThreadPoolExecutor, and my solution that worked in Python 3.8 broke with 3.9. I'm running some long-running (possibly infinite) tasks in the thread pool, and I cancel them in an atexit callback. Python 3.9 broke this due to https://bugs.python.org/issue39812. That change introduced a new atexit-like mechanism to the threading module and uses it where Python 3.8 used regular atexit. This means that ThreadPoolExecutor's atexit runs before mine, and since I never get a chance to cancel my tasks, it deadlocks.

One way I can solve this is to move my own atexit function to threading._register_atexit, but that interface is private. On the other hand, even without the change in Python 3.9, my use of atexit was already fragile, since it depended on the relative ordering of atexit hooks. One clean solution is to do the cancellation at the end of the main module instead of in an atexit hook. However, I'm doing this in a library, so I don't have any way but atexit to ensure that this happens. Another option is to forgo ThreadPoolExecutor entirely and manage the threads myself.

My code in question is in a not-yet-released branch of Tornado: https://github.com/tornadoweb/tornado/blob/5913aa43ecfdaa76876fc57867062227b907b1dd/tornado/platform/asyncio.py#L57-L73

With the master branch of Tornado, Python 3.9, and Windows, this deadlock causes a hang at interpreter shutdown.
IMO, a better practice would be to give those potentially infinite running tasks a direct method of escape and to invoke it before calling executor.shutdown(); that would be a more reliable approach. But perhaps there is some convenience in being able to provide custom atexit hooks. It might also help the user to separate the shutdown logic from the rest of the program.

Since you worked with me on adding threading._register_atexit(), do you have any thoughts, Antoine? I would personally not be opposed to making it public, assuming there's real utility in doing so. My only concern is that it might be a potential foot-gun: if the user registers an atexit hook that deadlocks, it might prevent threads from shutting down safely prior to interpreter finalization. I'm presently undecided whether explicitly mentioning that in the docs would be sufficient warning.
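For illustration, a minimal sketch of that "direct method of escape" pattern (the names stop_event and run_forever are made up for this example, not anything from the stdlib): the long-running task checks a threading.Event on each iteration, and the application sets the event and only then calls executor.shutdown().

```python
import concurrent.futures
import threading

stop_event = threading.Event()  # hypothetical escape hatch for the task

def run_forever():
    # Long-running task that exits as soon as the stop event is set.
    while not stop_event.wait(timeout=0.1):
        pass  # do one unit of work per iteration

executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
executor.submit(run_forever)

# At shutdown time: signal the task first, then shut the pool down so the
# worker thread can be joined without blocking forever.
stop_event.set()
executor.shutdown(wait=True)
```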
Agreed, but the problem is that I'm in a library (so I don't control the main module), and the library's interface does not mandate any sort of explicit shutdown method. There is a shutdown method, but almost no one calls it, and it's never caused a problem until Python 3.9 changed things so it deadlocks.
Yes, and that is exactly the problem. concurrent.futures submits an atexit hook whose behavior depends on application code, and through that I have inadvertently caused a deadlock.
I have resolved my issue here by moving from ThreadPoolExecutor to a plain threading.Thread that I manage by hand (tornadoweb/tornado@15832bc). Therefore I no longer need this for myself and I leave it up to you to decide whether there's anything worth doing at this point.
Another way to do this is to call threading.main_thread().join() in another thread and do the shutdown cleanup when it returns. The main thread is stopped at shutdown just before the threading._threading_atexits are called.
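A rough sketch of that approach, assuming the cleanup can be expressed as setting an event (the names here are illustrative only): a daemon watcher thread joins the main thread and runs the cleanup once the join returns, which happens just before the threading atexit callbacks (including ThreadPoolExecutor's) start waiting on the worker threads.

```python
import threading

stop_event = threading.Event()  # illustrative: whatever tells the tasks to finish

def _watch_main_thread():
    # join() returns once the main thread is stopped at interpreter shutdown,
    # just before threading's atexit callbacks (e.g. ThreadPoolExecutor's) run.
    threading.main_thread().join()
    stop_event.set()  # perform the shutdown cleanup here

threading.Thread(target=_watch_main_thread, daemon=True).start()
```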
To be clear, by "cancel" you are not talking about Future.cancel(). Rather, your handler causes all running tasks to finish (by sending a special message on the socket corresponding to each running task). Is that right? Some other things I inferred from your atexit handler:
Does all that sound right? Most of that is relevant to some possible solutions I have in mind.
Alternately, perhaps ThreadPoolExecutor isn't the right fit here, as implied by the route you ended up going. It seems like it's not well-suited for long-running (or infinite) tasks. In that case, perhaps the concurrent.futures docs could be more clear about when ThreadPoolExecutor is not a good fit and what the alternatives are.
FWIW, here's a brain dump about ThreadPoolExecutor and its atexit handler after having looked at the code.

----

First, the relationship between the objects involved:

Observations:

----

Next, let's look at the relevant ways the objects can be used:

Observations:

----

Regarding how tasks run:

Observations:

----

Now let's look more closely at the atexit handler.

The handler does the following:

It does not:

ThreadPoolExecutor.shutdown() basically does the same thing. However, it only does the steps above for its own tasks. It also optionally calls Future.cancel() for each queued task (right before step 2), although all that does is keep any queued-but-not-running tasks from starting. Also, you can optionally skip step 4.
(assuming we want to support long-running tasks here)

With all the above in mind, there are a few things that may help. The first that comes to mind is to have the atexit handler call ThreadPoolExecutor.shutdown() for each instance. So something like this:

```python
def _python_exit():
    global _shutdown
    with _global_shutdown_lock:
        _shutdown = True
    for executor in list(_executors):
        executor.shutdown()
```

That would require a little refactoring to make it work. However, the change is simpler if each executor has its own atexit handler:

```python
class ThreadPoolExecutor(_base.Executor):
    def __init__(self, ...):
        ...
        threading._register_atexit(self._atexit)

    def _atexit(self):
        global _shutdown
        _shutdown = True
        self.shutdown()
```

The value of either approach is that you can then subclass ThreadPoolExecutor to get what you want:

```python
class MyExecutor(ThreadPoolExecutor):
    def shutdown(self, *args, **kwargs):
        stop_my_tasks()
        super().shutdown(*args, **kwargs)
```

One thing I thought about was supporting a per-task finalizer instead, since that aligns more closely with the way ThreadPoolExecutor works. It would only apply to the task it was registered for. So something like one of the following:
----

Other things that could be helpful:

----

FWIW, adding support for some sort of sub-atexit handler isn't so appealing. I'm talking about something like one of the following:

(It would probably make sense to pass the list of currently running tasks to the callable.)
Correct. My tasks here are calls to functions from the select module. The select calls could be given a timeout so there is never an infinite task, but that's not ideal: a timeout that's too low has a performance cost as calls time out and restart even when the system is "at rest", and a too-long timeout is still going to be perceived as a hanging application.
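As a sketch of the wake-up alternative to a timeout (illustrative only, not the actual Tornado code): register one end of a socket.socketpair() with a selector and write a byte to the other end when the task needs to stop, so the blocked select call returns immediately.

```python
import selectors
import socket

sel = selectors.DefaultSelector()
waker_recv, waker_send = socket.socketpair()  # hypothetical wake-up channel
sel.register(waker_recv, selectors.EVENT_READ)

def select_task():
    # Blocks with no timeout, but returns as soon as the waker is written to.
    for key, _ in sel.select(timeout=None):
        if key.fileobj is waker_recv:
            waker_recv.recv(4096)  # drain the wake-up byte
            return  # let the worker thread finish

def cancel_task():
    # Called from the shutdown/atexit path to unblock the selector call.
    waker_send.send(b"x")
```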
Correct. If the task were buggy it could still cause a deadlock. In my case the task is simple enough (a single selector call) that this is not a risk.
Each task is associated with a selector object (managing a set of sockets), not a single socket. There is only ever one task at a time; a task is enqueued only after the previous one finishes. (This thread pool is not used for any other purpose)
In my case this one-at-a-time rule means that the queue is always empty. But yes, in a more general solution you'd need some sort of interlock between cancelling existing tasks and starting new ones.
Yes, this is my conclusion as well. I filed this issue because I was frustrated that Python 3.9 broke previously-working code, but I'm willing to chalk this up to Hyrum's law and I'm not sure that this is something that ThreadPoolExecutor should be modified to support.
Another issue (also related to long-running threads, though I don't think that's essential) is that there is no way to use an atexit hook to signal worker threads to shut down. For example:

```python
import atexit
import concurrent.futures
import time

stop_signal = False

def waiter():
    while not stop_signal:
        time.sleep(1)
        print('.', end="")
        import sys; sys.stdout.flush()

def stop_workers():
    global stop_signal
    print("called!")
    stop_signal = True

atexit.register(stop_workers)

if __name__ == "__main__":
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    executor.submit(waiter)
    assert False
```
assert False When you run this, it will block and continue executing the ..^CError in atexit._run_exitfuncs:
Traceback (most recent call last):
File "/usr/lib/python3.8/concurrent/futures/thread.py", line 40, in _python_exit
t.join()
File "/usr/lib/python3.8/threading.py", line 1011, in join
self._wait_for_tstate_lock()
File "/usr/lib/python3.8/threading.py", line 1027, in _wait_for_tstate_lock
elif lock.acquire(block, timeout):
KeyboardInterrupt
called! So there's effectively a deadlock between the I feel like an inability to tell threads to gracefully exit when the main thread ends is a pretty significant limitation of Note: the same issue seems to occur on python 3.10, so it's not specific to 3.8. |
See python#86128 (comment) for an example where this happens. Essentially, even if it looks like you added an `atexit` handler to instruct your thread to exit gracefully, it will only be executed _after_ your thread has finished. For long-running threads (e.g. threads listening to a queue), that may never happen at all. Elsewhere in python#86128, it's recommended that `ThreadPoolExecutor` not be used for long-running tasks, but this was not reflected in the documentation. Based solely on the API, there is no reason to think you shouldn't use it for long-running tasks. The only reason appears to be a limitation in its implementation, so that should be made explicit in the docs.
…ehaviour (pythonGH-94008) (cherry picked from commit 7df2f4d) Co-authored-by: Lucas Wiman <lucas.wiman@gmail.com>
How much of this is actually specific to ThreadPoolExecutor?
@SyntaxColoring Prior to 3.9, I think your example is expected behavior for non-daemon threads, though the threading/atexit docs could probably be clarified a bit on this point.
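A minimal illustration of that point (my own example, not from the thread): with a plain non-daemon thread, the interpreter joins the thread at shutdown before module-level atexit hooks run, so the same hang occurs without ThreadPoolExecutor being involved at all.

```python
import atexit
import threading
import time

stop = False

def worker():
    # Non-daemon thread: threading._shutdown() joins it at interpreter exit
    # *before* atexit hooks run, so it never sees stop become True.
    while not stop:
        time.sleep(1)

def request_stop():
    global stop
    stop = True  # too late: the interpreter is already blocked joining worker

atexit.register(request_stop)
threading.Thread(target=worker).start()  # script hangs at exit
```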