-
-
Notifications
You must be signed in to change notification settings - Fork 26k
ENH Add Deprecating Position Arguments Helper #13311
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Your agility across different kinds of development problems is impressive,
@thomasjpfan. Thanks for the good work.
|
Is the idea that after two releases, we remove the "*" that gets added in functions? (I find that it really looks ugly. I could see it confusing beginners. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to make sure that this warning raises in our test suite and doc building.
We should also document this is what's new and perhaps more clearly somewhere in the user guide??
The only problem I see at the moment is that the * doesn't show in the API reference documentation. I don't think it's a big problem, since almost all args can be passed as kwargs.
Do we need to test support for non-__init__
methods?
@@ -1788,3 +1788,10 @@ def test_penalty_none(solver): | |||
"LogisticRegressionCV", | |||
lr.fit, X, y | |||
) | |||
|
|||
|
|||
def test_logistic_warns_with_args(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove this. it's not practical or helpful to have this for everything.
sklearn/utils/validation.py
Outdated
args_msg = ['{}={}'.format(name, arg) | ||
for name, arg in zip(kwonlyargs[:extra_args], | ||
err_args[-extra_args:])] | ||
warnings.warn("Should use keyword args: " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something more forceful. "Pass xxxx as keyword args (e.g. xxxx=...). From version 0.24 passing these as positional arguments will result in an error."
Then we need to obligate ourselves to finish all the decorating by 0.22.
sklearn/utils/validation.py
Outdated
@@ -936,3 +937,38 @@ def check_non_negative(X, whom): | |||
|
|||
if X_min < 0: | |||
raise ValueError("Negative values in data passed to %s" % whom) | |||
|
|||
|
|||
def warn_args(f): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe call it deprecate_positional_args
?
sklearn/utils/validation.py
Outdated
def inner_f(*args, **kwargs): | ||
extra_args = len(args) - len(orig_spec) | ||
if extra_args > 0: | ||
# ignore first 'self' argument for class methods |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
class -> instance
sklearn/utils/validation.py
Outdated
orig_spec = argspec.args | ||
|
||
# Assumes class method has 'self' as first argument | ||
is_class_method = orig_spec and 'self' == orig_spec[0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is inspect.ismethod
applicable here? Maybe it's not bound yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is_class_method -> is_method
No, the idea is that that
|
Codecov Report
@@ Coverage Diff @@
## master #13311 +/- ##
==========================================
- Coverage 96.8% 92.49% -4.32%
==========================================
Files 393 420 +27
Lines 71859 74830 +2971
Branches 7866 0 -7866
==========================================
- Hits 69564 69211 -353
- Misses 2272 5619 +3347
+ Partials 23 0 -23
Continue to review full report at Codecov.
|
Does this need a slep/discussion/vote? |
I wonder what happens with auto-completion e.g. in Jupyter etc. Would it still work even with the decorator? |
Here's the SLEP: scikit-learn/enhancement_proposals#19
I put the output for ipython and VSCode in the SLEP |
@thomasjpfan as I pasted in the SLEP, the order of items in the hint shown by ipython seem different with the decorator, any chance we could fix/change that? |
@adrinjalali Are you using the |
Yeah I took the decorator from this PR. I haven't tried jupyterlab though. So you're saying you see something else than the one I pasted in the SLEP? |
Yup. I get the same auto complete results for the wrapped and unwrapped version of the function. My implementation calls decorates the wrapped function with |
I quite like their idea of |
@adrinjalali I once started writing a library to do that: |
SLEP009 is all but accepted. Please resolve conflicts and update for review. |
Okay lets see how this looks like with #15005 (comment) |
noise_scale = 0.01 | ||
|
||
# Create S-curve dataset | ||
X, y = datasets.samples_generator.make_s_curve(n_samples, random_state=0) | ||
|
||
# Compute isomap embedding | ||
iso = manifold.Isomap(n_components, 2) | ||
iso = manifold.Isomap(n_neighbors, n_components=2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This really makes me want to do: Isomap(*, n_neighbors=5, n_components=2,...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm happy with that
Do you have a way of systematically considering all the signatures? |
1 similar comment
Do you have a way of systematically considering all the signatures? |
For working on this PR, I had a code snippet that checked for keyword only arguments based on order: from inspect import signature, Parameter
from sklearn.utils.testing import all_estimators
from collections import defaultdict
whitelist = set([
'n_components', 'estimator', 'base_estimator', 'n_clusters', 'n_neighbors', 'steps',
'regressor', 'transformers', 'n_estimators', 'transformer_list', 'score_func', 'func',
'degree', 'estimators'])
public_estimators = [tup for tup in all_estimators() if not tup[0].startswith("_")]
estimators_by_module = defaultdict(list)
for name, est in public_estimators:
module = est.__module__
estimators_by_module[module].append((name, est))
for module, estimators in estimators_by_module.items():
print(module)
for name, est in estimators:
sig = signature(est)
if sig.parameters:
params = list(sig.parameters.items())
first_name, first_param = params[0]
if first_name in whitelist:
if first_param.kind == Parameter.KEYWORD_ONLY:
print("*", name, first_param, "INCORRECT KEYWORD_ONLY")
params = params[1:]
# remove parameters without defaults
params = [p for p in params if p[1].default != Parameter.empty]
# the rest must be keywords only
if not all(p[1].kind in [Parameter.KEYWORD_ONLY]
for p in params):
print("*", name, "THE REST SHOULD BE KEYWORD_ONLY") I added a test, |
sklearn/tests/test_common.py
Outdated
"name, Estimator", | ||
[(name, Estimator) | ||
for name, Estimator in all_estimators() if not name.startswith("_")]) | ||
def test_estimators_keyword_only(name, Estimator): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test makes sure that the whitelist is POSITION_OR_KEYWORD and that the rest of the parameters (without defaults) is KEYWORD_ONLY.
I agree with #13311 (comment), can we first merge a simpler PR where only the decorator is introduced? |
Honestly I'm still uncomfortable about the long whitelist (e.g., |
After we merge this decorator, and we go through each module group by group. We would need to wait till we go through every module before we can release. Otherwise there will be some estimators that have this warning and others that do not.
@qinhanmin2014 What do you think the whitelist should be? |
Not sure, but shorter than current version, e.g., perhaps n_components, n_clusters and n_neighbors should not be in the list. But as I've said in gitter, if this is the final decision, I won't oppose. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good now. We need probably another PR to explain keyword only args and how we use them in sklearn I think. (Another example of where a blog would be nice)
This PR was stripped down to only include the decorator. It doesn’t use it anywhere. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, given that this only add the helper function. Thanks @thomasjpfan !
Reference Issues/PRs
Addresses #12805
What does this implement/fix? Explain your changes.
Adds a function decorator,
warn_args
, that can wraps a function and issues a warning when any of the positional arguments passed after*
will issue a warning.For functions:
Calling
dbscan(X, 0.1, 5)
will result with aDeprecationWarning
:For classes
Calling
LogisticRegression('l2', True)
will result with aDeprecationWarning
:Any other comments?
_deprecate_positional_args
uses the fact that the first element of a class method is traditionally namedself
to determine if the function is called in a instance method like__init__
.