Completing sample props SLEP006 #43
Conversation
This PR is filling in several TODOs in SLEP006, with the aim of having it ready for voting in September 2020, in conjunction with refinements to scikit-learn/scikit-learn#16079.

By default, `sample_weight` will not be requested by estimators that support it. This ensures that addition of `sample_weight` support to an estimator will not change its behaviour.
Are you sure? I think right now if the underlying estimator supports `sample_weight` and the user passes `sample_weight` to the meta-estimator's `fit`, it'll be passed to the estimator. With this default you suggest, it wouldn't.
right now if the underlying estimator supports `sample_weight` and the user passes `sample_weight` to the meta-estimator's `fit`, it'll be passed to the estimator. With this default you suggest, it wouldn't.
That's true. My suggestion here sacrifices some usability for explicitness. The idea would be that (after a deprecation period, or in version 1.0) passing `sample_weight` to the meta-estimator where no consumer has requested it (the default) would raise an error: unexpected fit parameter. Thus the explicit use of `require_sample_weight` would be forced.
As well as ongoing compatibility issues (code invisibly starts using weights when someone implements them), my concern with requesting `sample_weight` by default is also that it may be confusing if estimators request `sample_weight` by default but scorers do not.
Making it explicit sucks a little because people need to express `sample_weight` usage in two places rather than one, but I think it won't hurt too much and will make sure the reader of the code knows what's going on.
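A minimal toy sketch of the error-on-unrequested behaviour described above (the class name and `requested_props` argument are invented for illustration; this is not the proposed scikit-learn API):

```python
class ToyMetaEstimator:
    """Toy meta-estimator that rejects fit params nobody has requested."""

    def __init__(self, estimator, requested_props=()):
        self.estimator = estimator
        self.requested_props = set(requested_props)

    def fit(self, X, y, **fit_params):
        unrequested = set(fit_params) - self.requested_props
        if unrequested:
            # the behaviour proposed above: fail loudly rather than
            # silently swallowing or silently consuming the prop
            raise TypeError(
                f"unexpected fit parameter(s): {sorted(unrequested)}; "
                "request them explicitly (e.g. require_sample_weight)"
            )
        # a real meta-estimator would now route fit_params to self.estimator
        return self
```

Here `fit` succeeds only when the caller has explicitly requested each prop up front, which is the two-places-rather-than-one trade-off the comment acknowledges.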
I'd say looking at the conversation in scikit-learn/scikit-learn#18159 folks would rather have this routing done implicitly?
I think I'd be in favor of that implicit routing too.
And I think scorers should ask for `sample_weight` by default too.
I'd say looking at the conversation in scikit-learn/scikit-learn#18159 folks would rather have this routing done implicitly?
I think @rth's motivation there is explicitly about avoiding the coupling of the param name with the composite structure of the estimator, i.e. you need to prefix things differently depending on how many layers of pipeline it's in, or depending on what the estimator's called. I've long been frustrated by that, for parameter names too, and it's a big reason I made searchgrid. Only Solution 2 here retains this issue. It is no longer seriously being considered.
I think I'd be in favor of that implicit routing too. And I think scorers should ask for sample_weight by default too.
I get that these days we don't have a PR called "Add sample_weight support to ..." very often, but it used to be very common, and I fear that we will fit very different models if we change support for sample_weight in an estimator or metric. It will become ever more important for people to pin their version, and ever harder to know what some code does just by reading the latest docs.
I understand the desire to have the behavior not change. But if for a moment we forget about the current behavior and think about what the intuitive desired behavior would be, which one would you go for?
From the developer's perspective it would be great to have sample weights always passed automatically: for scoring and fitting to be weighted by default.
From the maintainer's or reader's perspective, however, this is hell. To know if weighted fitting is used, you need to explicitly look up whether weights are supported in each estimator or scorer used. Switching estimators in a grid search can enable or disable weighted fitting or scoring implicitly. Upgrading scikit-learn or third-party libraries could enable weighted fitting or scoring implicitly.
All this tells me is that the straw man developer above was mistaken. All code must be written to be read, and to be read unambiguously. This developer had the intuition that implicit is better than explicit, but they were wrong.
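The switching hazard raised above can be made concrete with a toy example (all names here are invented; no scikit-learn API is involved): under implicit routing, merely swapping the inner estimator silently changes whether weights are used.

```python
class WeightedToy:
    # toy estimator that happens to accept sample_weight
    supports_weight = True

    def fit(self, X, y, sample_weight=None):
        self.used_weights_ = sample_weight is not None
        return self


class UnweightedToy:
    # toy estimator with no sample_weight support
    supports_weight = False

    def fit(self, X, y):
        self.used_weights_ = False
        return self


def implicit_fit(est, X, y, sample_weight):
    """Implicit rule: pass weights whenever the estimator supports them."""
    if est.supports_weight:
        return est.fit(X, y, sample_weight=sample_weight)
    return est.fit(X, y)
```

Swapping `WeightedToy` for `UnweightedToy` in a grid search would toggle `used_weights_` with no visible change at the call site, which is exactly the readability concern stated above.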
Ok fair enough. Pinging @rth @glemaitre and @NicolasHug since this is very much related to scikit-learn/scikit-learn#18159
During a deprecation period, `fit_params` will be handled dually: keys that are requested will be passed through the new request mechanism, while keys that are not known will be routed using legacy mechanisms. At completion of the deprecation period, the legacy handling will cease.
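The dual handling described in the quoted text might be sketched as follows (the function name and warning category are illustrative assumptions, not the final implementation):

```python
import warnings


def split_fit_params(fit_params, requested_keys):
    """Partition fit params into new-style and legacy-routed dicts."""
    new_style, legacy = {}, {}
    for key, value in fit_params.items():
        if key in requested_keys:
            new_style[key] = value  # routed via the new request mechanism
        else:
            legacy[key] = value  # routed via the legacy mechanism
    if legacy:
        warnings.warn(
            f"Parameters {sorted(legacy)} are routed via the legacy "
            "mechanism; this will raise an error after the deprecation "
            "period.",
            FutureWarning,
        )
    return new_style, legacy
```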
Do we have to support the combination? It'd probably be cleaner if we only supported the legacy behavior when everything follows that protocol.
Although that may have the issue that third-party estimators won't all adopt this API at the same time.
How will the metaestimator know whether or not to use legacy mode?
Here I'm assuming you're talking about the `estimator__param` format, and for that, at least in the `Pipeline`, I thought of checking whether any parameter follows that format. If yes, then we fall back to the old behavior completely. If no parameter follows that format, then we do things the new way.
I think this would cause less confusion than handling them both at the same time.
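The detection rule suggested here is simple to sketch (the helper name is hypothetical):

```python
def uses_legacy_routing(fit_params):
    """True if any fit param key uses the legacy step_name__param format."""
    return any("__" in key for key in fit_params)
```

With this rule a `Pipeline` would route the whole dict one way or the other, never mixing the two mechanisms in a single `fit` call.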
If yes, then we fallback to the old behavior completely. If no parameter follows that format, then we do things the new way.
Then that disables support for a prop explicitly requested as `step_name__*`?
I don't really see what we lose by supporting both during deprecation.
I find understanding code which passes some parameters to a pipeline as `step_name__*` and other parameters as `param_name` confusing. Otherwise, I'm just worried about the code complexity (and potential bugs) of supporting them side by side.
I'm not yet sure how error-prone or complex this code would be.
We could have users switch on the new routing mechanism with a config until v1.0?
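That opt-in idea could look roughly like this toy sketch (the flag name `enable_prop_routing` and both functions are invented for illustration; in practice this would presumably live in scikit-learn's `set_config`):

```python
# hypothetical global config with the new routing off by default
_config = {"enable_prop_routing": False}


def set_config(enable_prop_routing=None):
    """Toy stand-in for a library-level config switch."""
    if enable_prop_routing is not None:
        _config["enable_prop_routing"] = enable_prop_routing


def route_fit_params(fit_params):
    """Dispatch to the new or legacy mechanism based on the config flag."""
    if _config["enable_prop_routing"]:
        return "new", fit_params
    return "legacy", fit_params
```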
Do we merge this so that the full SLEP is rendered on rtd?
Sure, it's easier to follow up on the rest of the issues if we have this up there anyway.
Thanks for doing this SLEP and proposing associated implementations! It's certainly a challenging topic and I don't have good visibility on all the implications/constraints. My comments are mostly on usability with pipelines as outlined in scikit-learn/scikit-learn#18159, in particular,
I would be interested in exploring a mixture of the option proposed in scikit-learn/scikit-learn#18159 (comment) and option 4. For sample weight that would be,
So for sample weight that would be a little similar to option 4 but less verbose and with a reasonable default. I'm not yet sure how this would generalize to other sample props, maybe keeping a
At least for fit, estimators / pipelines that support sample weight need to verify invariance properties.
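The invariance referred to here can be illustrated with a weighted mean standing in for an estimator: an integer weight of k should be equivalent to repeating the sample k times, and a zero weight to dropping the sample entirely.

```python
import numpy as np


def weighted_mean(x, sample_weight):
    """Weighted mean as a minimal stand-in for a weighted 'fit'."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(sample_weight, dtype=float)
    return float(np.sum(w * x) / np.sum(w))


x = [1.0, 2.0, 3.0]
# weight 2 on the middle sample == repeating that sample once
assert weighted_mean(x, [1, 2, 1]) == weighted_mean([1.0, 2.0, 2.0, 3.0], [1, 1, 1, 1])
# zero weight == dropping the sample
assert weighted_mean(x, [1, 1, 0]) == weighted_mean([1.0, 2.0], [1, 1])
```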
Another option for the verbosity issue is to have a utility function to say something like
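For example, a hypothetical one-shot helper (name and API invented here, not part of any proposal) could mark an estimator as requesting weights in a single call:

```python
def request_sample_weight(est):
    """Mark an estimator as requesting sample_weight; returns it unchanged."""
    est.request_sample_weight = True  # stand-in for the real request API
    return est
```

Wrapping each pipeline step this way would keep the explicitness of option 4 while reducing the per-step boilerplate.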
No, I'd see this as choosing to deprecate
I'd now see more likely syntax as the above. I appreciate that this could be more usable still.
https://scikit-learn-enhancement-proposals.readthedocs.io/en/latest/slep006/proposal.html