FEAT multioutput routes metadata #22986

adrinjalali · 2022-03-29T10:37:23Z

This is the first PR that handles the deprecation of old code paths where users pass metadata without setting request values. It adds the machinery to do so, and adds routing to the MultiOutput estimators.

@agramfort may be interested in this as well.

ping @jnothman @thomasjpfan @lorentzenchr

jnothman

This is one of the cases where the current system just works, and the proposal seems highly bureaucratised :(

I do wonder if we should just be using fit params without distinguishing from partial_fit. Did we have use cases for distinguishing fit and partial_fit params?

adrinjalali · 2022-03-29T11:49:27Z

> This is one of the cases where the current system just works, and the proposal seems highly bureaucratised :(

I started writing this:

Another alternative that I have explored for meta-estimators such as this one, which "only" pass metadata to the single sub-estimator they have, is to behave like a consumer rather than a router, and expose their sub-estimator's requests.

And realized there could potentially be an issue with this implementation, since MultiOutputClassifier implements its own score, which would make it both a consumer and a router. In this case it's not an issue, since the score method doesn't accept any metadata.
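To make the router-versus-consumer distinction concrete, here is a minimal pure-Python sketch. It is a toy model, not scikit-learn's implementation: `ToyConsumer`, `ToyRouter`, `fit_requests`, and `get_fit_requests` are hypothetical names invented for illustration, and the request semantics (unset raises, `False` drops, `True` forwards) are an assumption about how such a wrapper could behave.

```python
class ToyConsumer:
    """Hypothetical stand-in for a sub-estimator that consumes metadata in fit."""

    def __init__(self):
        # Maps metadata name -> True (forward it) or False (drop it).
        # A missing key means "the user never stated a preference".
        self.fit_requests = {}
        self.seen = None

    def set_fit_request(self, **requests):
        self.fit_requests.update(requests)
        return self

    def fit(self, X, y, **metadata):
        self.seen = metadata  # record what actually arrived
        return self


class ToyRouter:
    """Hypothetical meta-estimator wrapping exactly one sub-estimator."""

    def __init__(self, estimator):
        self.estimator = estimator

    def get_fit_requests(self):
        # "Behave like a consumer": expose the child's requests as our own.
        return dict(self.estimator.fit_requests)

    def fit(self, X, y, **metadata):
        requests = self.get_fit_requests()
        routed = {}
        for name, value in metadata.items():
            request = requests.get(name)  # None means no request was set
            if request is None:
                raise ValueError(f"{name!r} was passed but never requested")
            if request:
                routed[name] = value
        self.estimator.fit(X, y, **routed)
        return self


# Usage: the request is set on the child, the metadata is passed to the wrapper.
child = ToyConsumer().set_fit_request(sample_weight=True)
router = ToyRouter(child).fit([[0.0]], [0], sample_weight=[1.0])
```

If the wrapper's own `score` also accepted metadata, it would need its own requests in addition to the child's, which is the router-and-consumer case described above.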

> I do wonder if we should just be using fit params without distinguishing from partial_fit. Did we have use cases for distinguishing fit and partial_fit params?

If we do that, do we also want predict_proba, predict_log_proba, decision_function, and predict to all share the same request? We could then have common tests making sure they all accept the same metadata in their signatures. Should we do that?
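A common test of that kind could look roughly like the sketch below. It only uses the standard library's `inspect` module; the helper names (`metadata_params`, `check_same_metadata`) and the two toy classes are hypothetical, and treating every explicit parameter other than `X` as metadata is a simplifying assumption.

```python
import inspect

PREDICTION_METHODS = (
    "predict",
    "predict_proba",
    "predict_log_proba",
    "decision_function",
)


def metadata_params(method):
    """Names of explicit parameters of a bound method, excluding the data X."""
    explicit = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    return {
        p.name
        for p in inspect.signature(method).parameters.values()
        if p.name != "X" and p.kind in explicit
    }


def check_same_metadata(estimator):
    """Return (consistent, per-method parameter sets) for the methods present."""
    found = {
        name: metadata_params(getattr(estimator, name))
        for name in PREDICTION_METHODS
        if hasattr(estimator, name)
    }
    distinct = {frozenset(params) for params in found.values()}
    return len(distinct) <= 1, found


class ConsistentToy:
    def predict(self, X, groups=None):
        return None

    def predict_proba(self, X, groups=None):
        return None


class InconsistentToy:
    def predict(self, X, groups=None):
        return None

    def decision_function(self, X):
        return None
```

Methods that take `**kwargs` would pass this check trivially, so a real test would likely need to compare declared request values rather than signatures alone.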

adrinjalali · 2022-04-05T15:10:05Z

> This is one of the cases where the current system just works, and the proposal seems highly bureaucratised :(

I'm not sure what to do with this concern @jnothman 😆

But I also don't think it's too bad. If this estimator accepted sample_weight in its score (which it does override, but somehow without taking sample_weight into account), then it would be a router AND a consumer, and the SLEP would still work perfectly fine.

> I do wonder if we should just be using fit params without distinguishing from partial_fit. Did we have use cases for distinguishing fit and partial_fit params?

Seems like we're keeping them separate then? (#22988)

jnothman

Sorry for the very slow review!!

sklearn/exceptions.py

sklearn/multioutput.py

sklearn/tests/test_metadata_routing.py

sklearn/tests/test_metaestimators_metadata_routing.py

sklearn/utils/_metadata_requests.py

sklearn/utils/estimator_checks.py

adrinjalali · 2022-06-02T12:14:02Z

The summary of the latest change:

Previously, the code would try to route the parameters and, if an exception was raised, assume that all of those parameters were requested and route again.

Now the exception carries enough information that, when it is raised, we can check whether the relevant parameters should be routed with a warning or whether an error should be raised.

It also made me change the signature of the warn_on method to specify, for each method, which metadata should raise a warning. For instance, if fit was previously routing sample_weight but no other metadata, we now warn when sample_weight has no request value but raise an error for the others, since routing of the other metadata is newly implemented.
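The warn-or-error decision described above can be sketched as follows. This is a simplified pure-Python model, not the code in this PR: `ToyUnsetMetadataError`, `route`, and `route_with_deprecation` are hypothetical names, and passing `warn_on` as a plain dict from method name to metadata names is an assumption about its shape.

```python
import warnings


class ToyUnsetMetadataError(ValueError):
    """Hypothetical exception that records which metadata had no request value."""

    def __init__(self, method, unrequested):
        super().__init__(f"{method}: no request value set for {sorted(unrequested)}")
        self.method = method
        self.unrequested = set(unrequested)


def route(method, requests, metadata):
    """Return the metadata to forward; raise if a passed key has no request."""
    unrequested = {name for name in metadata if requests.get(name) is None}
    if unrequested:
        raise ToyUnsetMetadataError(method, unrequested)
    return {name: value for name, value in metadata.items() if requests[name]}


def route_with_deprecation(method, requests, metadata, warn_on):
    """Route once; on failure, warn only for metadata that was forwarded before."""
    try:
        return route(method, requests, metadata)
    except ToyUnsetMetadataError as exc:
        legacy = set(warn_on.get(exc.method, ()))
        if not exc.unrequested <= legacy:
            # At least one key is newly routable: a missing request is an error.
            raise
        warnings.warn(
            f"{sorted(exc.unrequested)} passed to {exc.method} without a "
            "request value; this will become an error.",
            FutureWarning,
        )
        # Preserve the old behaviour for now by treating legacy keys as requested.
        patched = {**requests, **{name: True for name in exc.unrequested}}
        return route(method, patched, metadata)
```

The point of carrying `method` and `unrequested` on the exception is that the caller no longer has to guess: it inspects the exception instead of blindly retrying with everything marked as requested.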

glemaitre

On a first read, it seems pretty straightforward.
I will have to make an additional pass and look more closely at the details of _is_default_request, just to be sure that nothing can go sideways.

sklearn/multioutput.py

sklearn/tests/test_metadata_routing.py

sklearn/tests/test_metaestimators_metadata_routing.py

thomasjpfan

Thanks for the update. I hope the complexity of the routing mechanism does not get any higher than this.

sklearn/multioutput.py

sklearn/tests/test_metaestimators_metadata_routing.py

sklearn/utils/_metadata_requests.py

sklearn/multioutput.py

sklearn/utils/_metadata_requests.py

jnothman

Yes, this is certainly much neater than the more magic solution before. I'm not sure we're going to quite achieve @thomasjpfan's request that "the complexity of the routing mechanism does not get any higher than this", but at least this lays most of the groundwork, so that the rest of the implementations are complicated, not complex. Congrats @adrinjalali!

In terms of the impact on users, however, this remains one of the cases where the implicit behaviour was actually okay: forwarding on any kwargs in fit to the child estimator. Now we will force the user to explicitly request these. I'd like to know from others: Is this a reasonable burden on the user? Is there a way to avoid it without major inconsistencies between simple routing as presented here and the more complex routing of a GridSearchCV or Pipeline (where there are many candidate destinations)?

sklearn/multioutput.py

sklearn/tests/test_metadata_routing.py

sklearn/utils/_metadata_requests.py

sklearn/utils/estimator_checks.py

adrinjalali · 2022-07-16T15:33:10Z

Opened #23928 to discuss the verbosity issue.

examples/plot_metadata_routing.py

glemaitre · 2022-07-17T12:55:45Z

The lines reported as not covered belong to the dummy estimators. I think it is fine if they are not tested.

glemaitre · 2022-07-17T12:56:28Z

Thanks @adrinjalali

adrinjalali added 25 commits March 10, 2022 22:25

FEAT add metadata routing support to scorers

1fbda38

clarify doc on not mutating scorers

51baf9d

improve tests

78638c4

remove unused import

ea317bc

fix tests

6c6c2f1

Christian's comments

b24c3ad

set_score_request -> with_score_request on scorers

a663b86

warn on overlapping kwargs and metadata

e838a71

add references to docs

0c6e2c4

add a note on custom scorers

0c61c05

Merge remote-tracking branch 'upstream/sample-props' into slep6-scorers

7840a6c

Revert "set_score_request -> with_score_request on scorers"

ba36307

This reverts commit a663b86.

set_score_request now mutates the instance

2bea203

don't test repr

dba266f

fix and test _passthrough_scorer

203b4e4

Joel's comments

4e32f31

writing test

da3c6a6

Thomas's comments

44f91ce

Merge branch 'slep6-scorers' into slep6/multioutput

3a5ca34

for Thomas

7421fe4

...

380b20a

Merge remote-tracking branch 'upstream/sample-props' into slep6/multi…

8c19c8c

…output

...

1c64ec0

all

f15ef55

Merge remote-tracking branch 'upstream/sample-props' into slep6/multi…

3e5048e

…output

github-actions bot added the module:utils label Mar 29, 2022

jnothman reviewed Mar 29, 2022

adrinjalali added the No Changelog Needed label Mar 29, 2022

adrinjalali added this to the 1.2 milestone Mar 29, 2022

fix docstrings

54b093b

jnothman reviewed Apr 30, 2022

adrinjalali added 8 commits May 12, 2022 16:53

Merge remote-tracking branch 'upstream/sample-props' into slep6/multi…

9a60ded

…output

minor edits

268ae56

Merge remote-tracking branch 'upstream/sample-props' into slep6/multi…

aeb4ed0

…output

staking a stab at param specific deprecation

fb36733

fix tests

1baea75

add test for new code

a985e83

remove unused _assume_requested

cb77212

document exception parmaters

3053582

glemaitre reviewed Jun 3, 2022

try a different URL

a8bc9ca

thomasjpfan reviewed Jun 11, 2022

jnothman approved these changes Jun 14, 2022

adrinjalali added 2 commits July 16, 2022 15:18

address comments

2fd59b0

Merge remote-tracking branch 'upstream/sample-props' into slep6/multi…

8329a6f

…output

adrinjalali mentioned this pull request Jul 16, 2022

RFC SLEP006: verbose vs non-verbose declaration in meta-estimator #23928

Open

adrinjalali added 2 commits July 16, 2022 18:02

add docs

c904c28

fix example code

8eb85cf

glemaitre reviewed Jul 16, 2022

examples/plot_metadata_routing.py (outdated, resolved)

adrinjalali added 2 commits July 16, 2022 18:31

remove extra backtick

a056d25

test _is_default_request and remove some unused lines

916e92b

glemaitre merged commit b93a9e6 into scikit-learn:sample-props Jul 17, 2022

adrinjalali deleted the slep6/multioutput branch July 17, 2022 12:56

haiatn mentioned this pull request Jul 29, 2023

Refactor metadata routing classes used in tests #23918

Closed
