-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
BUG Passes global configuration when spawning joblib jobs #17634
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BUG Passes global configuration when spawning joblib jobs #17634
Conversation
Thanks for detecting this and the issue at joblib! Let's see if there is a way to fix it on the joblib side (or maybe our config mechanism needs to be revisited). Using a parallel wrapper means that external users of joblib + scikit-learn would still run into this issue.. |
@rth do you have a proposal for a revised config handling? |
I have a "fun" proposal here: joblib/joblib#1071 (comment) (Global config + doing things in parallel = fun) |
We could just use os.environ to store the config? |
Storing them in the environment may not work: joblib/joblib#1064 |
Doesn't that depend on the backend? It should work as long as all those processes are on the same machine, doesn't it? |
Maybe we could pass it as environment and make joblib / loky detect that environment has changed and re-spawn workers accordingly. That might be a bit expensive though. |
The threading backend should have no problem. multiprocessing + fork should work out of the box but it's broken in other ways and not available on windows. loky and dask will need explicit config init. |
I think I'm okay with this, but I'd probably move the code to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apart from not being comfortable with the use of utils.fixes for something other than a backport, this looks great
build_tools/circle/linting.sh
Outdated
@@ -136,13 +136,13 @@ check_files() { | |||
|
|||
if [[ "$MODIFIED_FILES" == "no_match" ]]; then | |||
echo "No file outside sklearn/externals and doc/sphinxext has been modified" | |||
else | |||
# else |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this still needed?
Should we use |
I think I can come to accept fixes :)
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apart from not being comfortable with the use of utils.fixes for something other than a backport, this looks great
This needs to be fixed in joblib upstream, and then it'll be removed, from that perspective, fixes.py
is the right place for it I think.
Otherwise this LGTM. I'm happy to have it as a temporary workaround.
sklearn/utils/fixes.py
Outdated
@@ -196,3 +200,21 @@ def _take_along_axis(arr, indices, axis): | |||
|
|||
fancy_index = tuple(fancy_index) | |||
return arr[fancy_index] | |||
|
|||
|
|||
def delayed(function, *args, **kwargs): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe add a comment like
remove when https://github.com/joblib/joblib/issues/1071 is fixed
?
This looks reasonable but would one among @lesteve @ogrisel or @tomMoral be also be able to review this, following discussion in joblib/joblib#1071? |
Could you please also check that the vendored version of |
I looked at: from tags 0.11 to 0.16 and the function does the same thing. After reading its implementation, we can use a simplified version in our def delayed(function):
"""Decorator used to capture the arguments of a function."""
@functools.wraps(function)
def delayed_function(*args, **kwargs):
return _FuncWrapper(function), args, kwargs
return delayed_function This way we do not need to depend on a vendored version. |
Hi @scikit-learn/core-devs, two approvals here: is something left? Or is it good to merge? |
Since the last commit and hence the CI runs were from almost 3 weeks ago, I'm happy to merge once we do another merge with the current master. Would you mind updating @thomasjpfan ? I feel like we agreed to having this in the meantime. |
Merged with master and it still passes. |
1 similar comment
Merged with master and it still passes. |
woohoo 🥳 let's do this then 👍 |
…rn#17634) * WIP * BUG Passes context to joblib jobs * BUG Fix * BUG Fix * CLN Moves delayed into fixes * CLN Adds linter * REV Revert diff * DOC Adds comment for removal * Use a simple implementation
On master the global configuration flag would not be passed joblib jobs with a multiprocess backend:
XREF: joblib/joblib#1071