Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
bpo-22393: Fix multiprocessing.Pool hangs if a worker process dies unexpectedly #10441
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
bpo-22393: Fix multiprocessing.Pool hangs if a worker process dies unexpectedly #10441
Changes from all commits
d37e360
bc08d85
4eac116
c8ba754
b36663b
f8500e2
848d304
1f93322
a172df6
4d614b3
65f6eaf
6d9c4ca
706f178
933c77a
efcc185
7c21ddd
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe avoid using
ProcessPoolExecutor
andfuture
terms, which are objects of theconcurrent.futures
package and not themultiprocessing
package.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This also applies to
ApplyResult
right?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of curiosity, is there any reason why we iterate on a list of of
self._cache
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to use the
_worker_state_lock
here? And in other places where_worker_handler._state
is manipulated?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens here if the
task_handler
is blocked, but we do not run_help_stuff_finish
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You've only launched a single task, so what if it was scheduled on the other worker? I don't think this test is reliable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At this point (without the patch) you would inevitably get the same behavior: the pool hangs forever. With this patch, if you launch a single task on a broken pool (or the pool will be broken before the result is collected), you'll get the BrokenPoolError, regardless of the worker that was killed. We could keep track of sane results and try to rescue the most, but the original fix didn't look into that and it might be subject for a different PR. Similarly (matter of another PR), we could identify when the pool could be recovered (e.g., the worker died when the pool was idle waiting for tasks).