-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
FIX remove all mention to externals.joblib in the codebase #12345
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
FIX remove all mention to externals.joblib in the codebase #12345
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, apart for the comment below and there is a flake8 failure.
Thanks for improving this!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
Should this be backported to 0.20.1 ?
+1 |
So my comments above are not exactly on the right lines :) but the general point is that I think we should not expose public attributes |
Ok I tried to remove all direct import of Note that I also exposed |
Hmm, right I don't know anymore, other opinions on this would be welcome @jnothman @ogrisel @amueller It depends on how you think the interaction between scikit-learn and joblib will evolve in the future. In 0.20.0 we have exposed from sklearn.externals.joblib import Parallel
# or
from sklearn.utils._joblib import Parallel is better IMO than from sklearn.utils import Parallel so that bugs are reported to the right issue tracker and joblib gets the recognition it deserves. Maybe we could have exposed just What happens if say in v0.15 joblib deprecates or renames |
the public/private state of Either way I feel like we need to resolve this for 0.20.1. |
(FYI I also had vetoed vendoring loky inside vendoring joblib very explicitly and that was ignored...) |
For reference:
|
hm... I'd still vote deprecate everything and unvendor. That will probably break all pickles in two versions but will remove some major headaches going forward. |
The advantage of vendoring (or pinning) is that, unless the user does something very explicit, we know which version of joblib they're using when handling issues around a particular version. I assume @GaelVaroquaux would have an opinion here. I don't think we need to provide |
Whether we vendor joblib, and how much of its API we are exposing as a public API of scikit-learn are two somewhat different/unrelated concerns. Let's move this discussion to a separate issue so it doesn't get lost in this PR, I'll try to open one later today with a summary of different points of view. |
@rth I don't think they are entirely separate, unless you would consider providing an interface without vendoring. Which we could do - but it would seem pretty strange to me. |
Can you elaborate? How else would you fix the problem this PR addresses? |
Well we never use sklearn.externals.joblib.load in our codebase... so we don't strictly need to support it. But we can. |
Okay, so prefix them with joblib_
|
Tried to write some summary on the joblib integration in #12447 |
I guess in the interest of getting 0.20.1 out we should probably table the larger discussion and finish this up in some way for now? |
I am not sure I got the general concensus here. I can rename Or I add an extra layer for all joblib-related import such that everything is imported under I can wrap this up tomorrow once given the green-light on the chosen solution. |
I don't have a strong preference between the two. Having a more clear namespace for joblib seems desirable, but that would also require moving all the other functions and objects again, right? |
This is not so complicated, I am happy to do it if needed.
…On Mon, 29 Oct 2018 at 18:26, Andreas Mueller ***@***.***> wrote:
I don't have a strong preference between the two. Having a more clear
namespace for joblib seems desirable, but that would also require moving
all the other functions and objects again, right?
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#12345 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ADKs-RFgXjbh0KH83ENSp6fKyoHtt5khks5upzo3gaJpZM4XVklB>
.
|
I will move all the exposed functions in |
Done. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice! Thanks for your patience on this one @tomMoral !
LGTM. Will wait for a day or so before merging, in case there are other comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The deprecation is not reflected in doc/modules/classes.rst
@jnothman I updated |
Also: should we say here that we're planning/considering completely unvendoring in the future
I think that we should. It's good to give a heads up.
|
The changes to classes.rst look fine |
@jnothman are you ok with doing it this way? You seemed a bit concerned about the change earlier. It seems the easiest way forward to me. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good.
doc/whats_new/v0.20.rst
Outdated
- |API| Removed all mentions of ``sklearn.externals.joblib``, and deprecated | ||
joblib methods exposed in ``sklearn.utils``, except for | ||
:func:`utils.parallel_backend` and :func:`utils.register_parallel_backend`, | ||
which impact the behavior of `sklearn`. Other functionalities are part of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"which allow users to configure parallel computation in scikit-learn".
Thanks @tomMoral |
yay! |
Recently I changed all my code to use sklearn.externals.joblib instead of joblib alone. What should I use that will be compatible in the future? Would I be able still use the joblib from sklearn or will this be deprecated in future versions? Thanks. |
I think we would recommend using raw joblib except for parallel_backend.
|
…ls.joblib (scikit-learn#12345)" This reverts commit ebed7fd.
…ls.joblib (scikit-learn#12345)" This reverts commit ebed7fd.
This PR intends to fix the failure seen in the CRON job using master branch of
joblib
. (See here)It removes all mention to
externals.joblib
in the codebase and replace it withsklearn.utils
.Note that for testing purposes, it is necessary to add the module
joblib
inutils
to be able to access more advanced functionality ofjoblib
without importing directly the module. I am not sure this is the best level to expose this, so let me know if this should only be access fromsklearn.utils._joblib
.