-
-
Notifications
You must be signed in to change notification settings - Fork 31.8k
gh-86128: Add warning to ThreadPoolExecutor docs #94008
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-86128: Add warning to ThreadPoolExecutor docs #94008
Conversation
Most changes to Python require a NEWS entry. Please add it using the blurb_it web app or the blurb command-line tool. |
See python#86128 (comment) for an example where this happens. Essentially, even if it looks like you added an `atexit` handler to instruct your thread to exit gracefully, it will only be executed _after_ your thread has finished. For long-running threads (e.g. threads listening to a queue), that may never happen at all. Elsewhere in python#86128, it's recommended that `ThreadPoolExecutor` not be used for long-running tasks, but this was not reflected in the documentation. Based solely on the API, there is no reason to think you shouldn't use it for long-running tasks. The only reason appears to be a limitation in its implementation, so that should be made explicit in the docs.
4304fc6
to
2bde132
Compare
(Note that while this is related to #86128, it does not fix that issue, merely documenting the behavior.) |
Thanks @lucaswiman for the PR, and @zooba for merging it 🌮🎉.. I'm working now to backport this PR to: 3.10. |
Thanks @lucaswiman for the PR, and @zooba for merging it 🌮🎉.. I'm working now to backport this PR to: 3.11. |
GH-95404 is a backport of this pull request to the 3.10 branch. |
…ehaviour (pythonGH-94008) (cherry picked from commit 7df2f4d) Co-authored-by: [object Object] <lucas.wiman@gmail.com>
…ehaviour (pythonGH-94008) (cherry picked from commit 7df2f4d) Co-authored-by: [object Object] <lucas.wiman@gmail.com>
GH-95405 is a backport of this pull request to the 3.11 branch. |
executed *before* any exit handlers added using `atexit`. This means | ||
exceptions in the main thread must be caught and handled in order to | ||
signal threads to exit gracefully. For this reason, it is recommended | ||
that ``ThreadPoolExecutor`` not be used for long-running tasks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there are still safe ways to use a ThreadPoolExecutor for long running tasks:
- as long as you use the ThreadPoolExecutor context manager,
- keep hold of a function to promptly abort the long-running tasks,
- and arrange for that function to be called before leaving the ThreadPoolExecutor context manager
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True, but it's only a recommendation to not use it, which means if you know what you're doing then you can ignore the recommendation 😉 The docs already explain what conditions have to be met to make it safe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@graingert It's discussed further in the issue I linked to, see @ericsnowcurrently's comment here: #86128 (comment)
Having dived into the source code a fair amount while debugging the issue I ran into, you're right that it can be used for long-running tasks. However one common way of handling graceful exit does not work, in a way that was so surprising that it took me hours to debug. Using a context manager would have required refactoring the code, while using threading.Thread
directly did not. So I think it's worth a warning.
Note that none of the examples in the docs handle this tricky case in the way you suggest. That might be a good addition to the docs. :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd definitely like to add this as an example/recipe.
My use case is, I'm using ThreadPoolExecutor as a synchronous structured concurrency primative where I manage cancellation myself.
This is in distributed.utils_test.loop_in_thread and anyio's BlockingPortal
My qualms about adding this as an example to the stdlib is that to do this I'm using c.f.Future in a way that's documented as incorrect
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a really fuzzy line between "it is recommended that ThreadPoolExecutor
not be used for long-running tasks because of this tricky case" and "this tricky case means that to use ThreadPoolExecutor
for long-running tasks, you should do it like this ...
" It seems like a small change to me if you add the example.
I was basing the recommendation off of (1) a core contributor saying that it's not a good fit for long-running tasks, and (2) my personal experience with confusing issues in using it for long-running tasks. I tend to agree with @zooba that a recommendation against is not the same as saying that it's incorrect. (As a contrived example, you might recommend against using cpython for tasks that cannot handle gc pauses, but someone might do it anyway and call gc.disable()
to avoid the problem. That can cause other, well-documented problems, but presumably they know what they're doing.)
ThreadPoolExecutor has unusual atexit behavior that can cause threads to block indefinitely. This PR documents that limitation, and adds an explicit warning to the documentation for ThreadPoolExecutor.
See #86128 (comment) for an example where this happens. Essentially, even if it looks like you added an
atexit
handler to instruct your thread to exit gracefully, it will only be executed after your thread has finished. For long-running threads (e.g. threads listening to a queue), that may never happen at all.Elsewhere in #86128, it's recommended that
ThreadPoolExecutor
not be used for long-running tasks, but this was not reflected in the documentation. Based solely on the API, there is no reason to think you shouldn't use it for long-running tasks. The only reason appears to be a limitation in its implementation, so that should be made explicit in the docs.Thanks very much reviewing!
I'm also open to attempting to fix the underlying odd behavior, though I don't think I have enough context to know what the right fix would be.