-
-
Notifications
You must be signed in to change notification settings - Fork 32.1k
gh-132969: Fix error/hang when shutdown(wait=False) and task exited abnormally #133222
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…sk exited abnormally When shutdown is called with wait=False, the executor thread keeps running even after the ProcessPoolExecutor's state is reset. The executor then tries to replenish the worker processes pool resulting in an error and a potential hang when it comes across a worker that has died. Fixed the issue by having _adjust_process_count() return without doing anything if the ProcessPoolExecutor's state has been reset. Added unit tests to validate two scenarios: max_workers < num_tasks (exception) max_workers > num_tasks (exception + hang)
Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool. If this change has little impact on Python users, wait for a maintainer to apply the |
… into origin/main Skip forkserver-dependent test on Windows
@hugovk: If this is indeed a blocker for the beta, +1 for me -- it shouldn't break anything. But, as week-old issue that affects 3.12+, I think it's OK to defer it. @itamaro: Did you intend this to block the beta? |
Should have said it here, not only in Discord :) |
Thanks all, it's been looked at and sounds like it can wait, so let's defer it. |
Misc/NEWS.d/next/Library/2025-04-30-19-32-18.gh-issue-132969.EagQ3G.rst
Outdated
Show resolved
Hide resolved
…use @unittest.skipIf annotation to skip test on win32, add some more comments
… into origin/main
…s by swiching to "spawn"
Misc/NEWS.d/next/Library/2025-04-30-19-32-18.gh-issue-132969.EagQ3G.rst
Outdated
Show resolved
Hide resolved
…er" if not windows
LGTM |
Thank you @YvesDup for the thorough review of my first PR, much appreciated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not very fond of a long NEWS, especially with the "A combination of ...", but it's the full CHANGELOG, so why not.
Misc/NEWS.d/next/Library/2025-04-30-19-32-18.gh-issue-132969.EagQ3G.rst
Outdated
Show resolved
Hide resolved
@ogbiggles Would you apply the last suggestions ? |
Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
…agQ3G.rst Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com>
Does this fix supersede this one? |
The two are addressing different issues, the referenced PR will not prevent the error that I am fixing in this one. |
🤖 New build scheduled with the buildbot fleet by @encukou for commit 4f46875 🤖 Results will be shown at: https://buildbot.python.org/all/#/grid?branch=refs%2Fpull%2F133222%2Fmerge If you want to schedule another build, you need to add the 🔨 test-with-buildbots label again. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! I started the buildbots; if they like the PR I'll merge it.
(Don't be alarmed if there are failures; not all buildbots are stable.)
Thanks @ogbiggles for the PR, and @encukou for merging it 🌮🎉.. I'm working now to backport this PR to: 3.13. |
Thanks @ogbiggles for the PR, and @encukou for merging it 🌮🎉.. I'm working now to backport this PR to: 3.14. |
…ited abnormally (pythonGH-133222) When shutdown is called with wait=False, the executor thread keeps running even after the ProcessPoolExecutor's state is reset. The executor then tries to replenish the worker processes pool resulting in an error and a potential hang when it comes across a worker that has died. Fixed the issue by having _adjust_process_count() return without doing anything if the ProcessPoolExecutor's state has been reset. Added unit tests to validate two scenarios: max_workers < num_tasks (exception) max_workers > num_tasks (exception + hang) (cherry picked from commit 598aa7c) Co-authored-by: Ajay Kamdar <140011370+ogbiggles@users.noreply.github.com>
GH-135343 is a backport of this pull request to the 3.13 branch. |
…ited abnormally (pythonGH-133222) When shutdown is called with wait=False, the executor thread keeps running even after the ProcessPoolExecutor's state is reset. The executor then tries to replenish the worker processes pool resulting in an error and a potential hang when it comes across a worker that has died. Fixed the issue by having _adjust_process_count() return without doing anything if the ProcessPoolExecutor's state has been reset. Added unit tests to validate two scenarios: max_workers < num_tasks (exception) max_workers > num_tasks (exception + hang) (cherry picked from commit 598aa7c) Co-authored-by: Ajay Kamdar <140011370+ogbiggles@users.noreply.github.com>
GH-135344 is a backport of this pull request to the 3.14 branch. |
When shutdown is called with wait=False, the executor thread keeps running even after the ProcessPoolExecutor's state is reset. The executor then tries to replenish the worker processes pool resulting in an error and a potential hang when it comes across a worker that has died. Fixed the issue by having _adjust_process_count() return without doing anything if the ProcessPoolExecutor's state has been reset.
Added unit tests to validate two scenarios:
max_workers < num_tasks (exception)
max_workers > num_tasks (exception + hang)
ProcessPoolExecutor
raises exception or hangs the process whenshutdown(wait=False)
and worker throws exception #132969