-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
MAINT Param validation: decorate all estimators with _fit_context #26473
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only nitpcks. I check that we did not forget any file and looked at the _validate_params
and if it makes sense to skip or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aren't prefer_skip_nested_validation=False/True
here redundant now that we've added them to the parameter validation definitions themselves?
sklearn/calibration.py
Outdated
@@ -318,6 +319,7 @@ def _get_estimator(self): | |||
|
|||
return estimator | |||
|
|||
@_fit_context(prefer_skip_nested_validation=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we'd need a comment in the code whenever these are False.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
Sorry @adrinjalali, I don't understand your question. Redundant with what ? |
What I mean is, in functions, |
Here we're setting it for estimators. Every estimator validates the parameters in the
|
# we don't want to validate again for each call to partial_fit | ||
partial_fit_and_fitted = ( | ||
fit_method.__name__ == "partial_fit" and _is_fitted(estimator) | ||
) | ||
|
||
if not global_skip_validation and not partial_fit_and_fitted: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about computing partial_fit_and_fitted
inside if not global_skip_validation:
? I don't know if _is_fitted(estimator)
is always fast, because it may check all parameter names of the estimator or call estimator.__sklearn_is_fitted__()
if present. It might be a good idea to avoid it when not necessary.
Follow-up of #25815
This PR actually applies the decorator to all estimators.
There are situations where we don't want to skip inner validation (
prefer_skip_nested_validation=False
):metric
andmetric_params
). Note that if the user only passes a callable we want to skip inner validation because the args passed to the callable come from us and not from the user.Estimators receiving cv objects can skip inner validation because cv objects are not validated yet. When we decide to validate cv objects we'll need to revisit this.
Something that I missed in #25815 is that for partial_fit we don't want to validate if fit or partial_fit has already been called (which was not done properly for all estimators). I added such a mechanism in the
_fit_context
decorator.