-
-
Notifications
You must be signed in to change notification settings - Fork 26k
[MRG+1] Restructure access to vendored/site Joblib #11471
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This pull request introduces 2 alerts and fixes 2 when merging 7093fc1 into a2b5645 - view on LGTM.com new alerts:
fixed alerts:
Comment posted by LGTM.com |
This pull request introduces 2 alerts and fixes 2 when merging 314daa1 into a2b5645 - view on LGTM.com new alerts:
fixed alerts:
Comment posted by LGTM.com |
This pull request fixes 2 alerts when merging e567b8f into a2b5645 - view on LGTM.com fixed alerts:
Comment posted by LGTM.com |
This is looking green. Ready for review. |
This pull request fixes 2 alerts when merging 7785ab6 into a2b5645 - view on LGTM.com fixed alerts:
Comment posted by LGTM.com |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
+1 for merge.
Should we add in the travis matrix a job that tests with SKLEARN_SITE_JOBLIB=1 (this should probably be done in a separate PR).
+1 as well, it sounds like a safe way to handle the switch.
It's already tested in the cron job that checks the master branch of everything. We could also update one on the existing entries to use the site joblib. No need to extend the size of the build matrix for this. |
LGTM it works. |
I will update the PR. |
sklearn/externals/_joblib.py
Outdated
@@ -5,11 +5,12 @@ | |||
|
|||
# An environment variable to use the site joblib |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe sklearn/externals/_joblib.py
can be moved to sklearn/utils/_joblib.py
.
My reasoning is that _joblib.py
is not really external but our own code to do the switching between sklearn.external.joblib
and joblib
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, I will do that as well.
It's already tested in the cron job that checks the master branch of everything.
Great! That should be enough.
|
I added another test and found a bug: cab9e0a Tests now pass both with and without |
A few things we may want to think about:
|
Indeed. If we accept one more level of nesting we could have:
# An environment variable to use the site joblib
if _os.environ.get('SKLEARN_SITE_JOBLIB', False):
from joblib import __all__
from joblib import *
from joblib import __version__
from joblib import logger
else:
from ..externals.joblib import __all__
from ..externals.joblib import *
from ..externals.joblib import __version__
from ..externals.joblib import logger to expose the full joblib namespace. |
+1 for having this in |
@jnothman my last commit put the "public" joblib symbols under |
My reasoning for using
|
I understand that we want to fix the parent issue but I was under impression that in the long run, we also wanted people to stop importing
|
I agree with you @jnothman. However, I also have the feeling that if in the future joblib introduces a new set of toplevel symbols that we want to use in scikit-learn, we will start to crowd the Also I am wondering if it might not be useful to do: >>> from sklearn.utils import joblib to get a natural hold on the full joblib namespace as sklearn sees it in test scripts / interactive sessions. This will still be possible with the later patterns though: >>> from sklearn.utils import _joblib as joblib so maybe this later point is not a big problem in the end. I am still worried about the first point though. |
Parallel is different to Memory here. We ask users to sometimes pass a Memory object as a parameter. Should they have to separately install Also, does |
The namespace is crowded already... I am not persuaded by this as a sufficient counter-argument. But shrug. It's not that important. |
Indeed, the user needs to use the |
so can we find consensus here? I'm okay with utils.joblib.x, but I think
straight utils.x is less confusing
|
so can we find consensus here? I'm okay with utils.joblib.x, but I think
straight utils.x is less confusing
I think that I would prefer utils.joblib.x. I find it more explicit.
|
Alright, I'll revert my last commit. |
This reverts commit 71241e0.
I plan to merge this without waiting for appveyor if travis is green. |
PEP8 failure :(. You'll need "# noqa" on some lines. |
We don't care no? Let me wait for the real tests to pass. |
OK. I tend not to like pyflakes warnings. |
Alright pushed. |
Merged. |
Thankfully the circle errors have stopped, but I don't think we quite
reached consensus here, and we've currently got sklearn.utils.Parallel etc,
against the opinions of @GaelVaroquaux and @ogrisel. If you think it's
right to put it in sklearn.utils.joblib, please don't mind me.
|
In order to fix #11408, this swaps
joblib
and_joblib
. It however, allows users to access joblib'sMemory
orParallel
functionality without accessingsklearn.externals._joblib
by importingMemory
,Parallel
, etc. intosklearn.utils
.