Skip to content

gh-86128: Add warning to ThreadPoolExecutor docs #94008

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jul 28, 2022

Conversation

lucaswiman
Copy link
Contributor

ThreadPoolExecutor has unusual atexit behavior that can cause threads to block indefinitely. This PR documents that limitation, and adds an explicit warning to the documentation for ThreadPoolExecutor.

See #86128 (comment) for an example where this happens. Essentially, even if it looks like you added an atexit handler to instruct your thread to exit gracefully, it will only be executed after your thread has finished. For long-running threads (e.g. threads listening to a queue), that may never happen at all.

Elsewhere in #86128, it's recommended that ThreadPoolExecutor not be used for long-running tasks, but this was not reflected in the documentation. Based solely on the API, there is no reason to think you shouldn't use it for long-running tasks. The only reason appears to be a limitation in its implementation, so that should be made explicit in the docs.


Thanks very much reviewing!

I'm also open to attempting to fix the underlying odd behavior, though I don't think I have enough context to know what the right fix would be.

@ghost
Copy link

ghost commented Jun 19, 2022

All commit authors signed the Contributor License Agreement.
CLA signed

@bedevere-bot
Copy link

Most changes to Python require a NEWS entry.

Please add it using the blurb_it web app or the blurb command-line tool.

@bedevere-bot bedevere-bot added the docs Documentation in the Doc dir label Jun 19, 2022
lucaswiman and others added 2 commits June 19, 2022 11:22
See python#86128 (comment) for
an example where this happens. Essentially, even if it looks like
you added an `atexit` handler to instruct your thread to exit gracefully,
it will only be executed _after_ your thread has finished. For
long-running threads (e.g. threads listening to a queue), that may
never happen at all.

Elsewhere in python#86128, it's recommended that `ThreadPoolExecutor` not
be used for long-running tasks, but this was not reflected in the
documentation. Based solely on the API, there is no reason to think
you shouldn't use it for long-running tasks. The only reason appears
to be a limitation in its implementation, so that should be made
explicit in the docs.
@lucaswiman lucaswiman force-pushed the document-ThreadPoolExecutor-limitation branch from 4304fc6 to 2bde132 Compare June 19, 2022 18:22
@lucaswiman
Copy link
Contributor Author

(Note that while this is related to #86128, it does not fix that issue, merely documenting the behavior.)

@zooba zooba merged commit 7df2f4d into python:main Jul 28, 2022
@zooba zooba added needs backport to 3.10 only security fixes needs backport to 3.11 only security fixes labels Jul 28, 2022
@miss-islington
Copy link
Contributor

Thanks @lucaswiman for the PR, and @zooba for merging it 🌮🎉.. I'm working now to backport this PR to: 3.10.
🐍🍒⛏🤖

@miss-islington
Copy link
Contributor

Thanks @lucaswiman for the PR, and @zooba for merging it 🌮🎉.. I'm working now to backport this PR to: 3.11.
🐍🍒⛏🤖

@bedevere-bot
Copy link

GH-95404 is a backport of this pull request to the 3.10 branch.

@bedevere-bot bedevere-bot removed the needs backport to 3.10 only security fixes label Jul 28, 2022
miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Jul 28, 2022
…ehaviour (pythonGH-94008)

(cherry picked from commit 7df2f4d)

Co-authored-by: [object Object] <lucas.wiman@gmail.com>
miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Jul 28, 2022
…ehaviour (pythonGH-94008)

(cherry picked from commit 7df2f4d)

Co-authored-by: [object Object] <lucas.wiman@gmail.com>
@bedevere-bot
Copy link

GH-95405 is a backport of this pull request to the 3.11 branch.

@bedevere-bot bedevere-bot removed the needs backport to 3.11 only security fixes label Jul 28, 2022
miss-islington added a commit that referenced this pull request Jul 28, 2022
…ur (GH-94008)

(cherry picked from commit 7df2f4d)

Co-authored-by: [object Object] <lucas.wiman@gmail.com>
miss-islington added a commit that referenced this pull request Jul 28, 2022
…ur (GH-94008)

(cherry picked from commit 7df2f4d)

Co-authored-by: [object Object] <lucas.wiman@gmail.com>
executed *before* any exit handlers added using `atexit`. This means
exceptions in the main thread must be caught and handled in order to
signal threads to exit gracefully. For this reason, it is recommended
that ``ThreadPoolExecutor`` not be used for long-running tasks.
Copy link
Contributor

@graingert graingert Aug 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there are still safe ways to use a ThreadPoolExecutor for long running tasks:

  • as long as you use the ThreadPoolExecutor context manager,
  • keep hold of a function to promptly abort the long-running tasks,
  • and arrange for that function to be called before leaving the ThreadPoolExecutor context manager

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, but it's only a recommendation to not use it, which means if you know what you're doing then you can ignore the recommendation 😉 The docs already explain what conditions have to be met to make it safe.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@graingert It's discussed further in the issue I linked to, see @ericsnowcurrently's comment here: #86128 (comment)

Having dived into the source code a fair amount while debugging the issue I ran into, you're right that it can be used for long-running tasks. However one common way of handling graceful exit does not work, in a way that was so surprising that it took me hours to debug. Using a context manager would have required refactoring the code, while using threading.Thread directly did not. So I think it's worth a warning.

Note that none of the examples in the docs handle this tricky case in the way you suggest. That might be a good addition to the docs. :-)

Copy link
Contributor

@graingert graingert Aug 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd definitely like to add this as an example/recipe.

My use case is, I'm using ThreadPoolExecutor as a synchronous structured concurrency primative where I manage cancellation myself.

This is in distributed.utils_test.loop_in_thread and anyio's BlockingPortal

My qualms about adding this as an example to the stdlib is that to do this I'm using c.f.Future in a way that's documented as incorrect

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a really fuzzy line between "it is recommended that ThreadPoolExecutor not be used for long-running tasks because of this tricky case" and "this tricky case means that to use ThreadPoolExecutor for long-running tasks, you should do it like this ..." It seems like a small change to me if you add the example.

I was basing the recommendation off of (1) a core contributor saying that it's not a good fit for long-running tasks, and (2) my personal experience with confusing issues in using it for long-running tasks. I tend to agree with @zooba that a recommendation against is not the same as saying that it's incorrect. (As a contrived example, you might recommend against using cpython for tasks that cannot handle gc pauses, but someone might do it anyway and call gc.disable() to avoid the problem. That can cause other, well-documented problems, but presumably they know what they're doing.)

@lucaswiman lucaswiman deleted the document-ThreadPoolExecutor-limitation branch August 8, 2022 16:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
docs Documentation in the Doc dir
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants