sample-props alternate implementation #20350

adrinjalali · 2021-06-24T15:56:01Z

This is an alternative to #16079, which is an implementation of SLEP006 (Routing sample-aligned meta-data).

The metadata_requests.py file includes most of the implementation. There are some challenges left which we need to figure out how to tackle.

You can see the SLEP examples implemented in test_props.py as test_slep_case* functions, and the documentation is in metadata_routing.rst.

Open Questions

should the sentence at the end of getting_started.rst be there? https://github.com/scikit-learn/scikit-learn/pull/20350/files#r659308253
a few outstanding questions here: sample-props alternate implementation #20350 (review)

Closes #9566, Closes #15425, Closes #3524, Closes #15425,
Fixes #4497, Fixes #7136, Fixes #13432, Fixes #4632, Fixes #7646 (add test), Fixes #8127 (add test), Fixes #8158, Fixes #7308, Fixes #21134, Fixes #18159, Fixes #20349
Enables #6322,
TODO:
#12052, #8710, #15282, #2630, #8950, #11429, #15282, #18028, #19465, #20167

Note: since this started from where #16079 was, there are still leftovers from that PR; I'm cleaning them up.

adrinjalali

What do we generally want to do with estimators which are not meta-estimators per se, but which do have an estimator constructor arg and forward {method}_params to fit or score? Should they do any kind of validation, or should they just pass everything along? Right now RFE does validation, but TransformedTargetRegressor doesn't. I'm not really sure which way to go there.

doc/getting_started.rst

adrinjalali · 2021-10-01T15:22:05Z

sklearn/linear_model/_ridge.py

+                .fit_requests(sample_weight=True)
+                .score_requests(sample_weight=True)
+            )
+            # The old behavior would be sample_weight=False for "score"
+            # Do we want to "fix" the issue, or keep the old behavior?


It seems to completely ignore sample_weight
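
For readers following the `fit_requests` / `score_requests` chain in the `_ridge.py` diff above, here is a toy sketch of how chainable request declarations could behave. Only the two method names come from the diff; the `ToyEstimator` class, its `_requests` dict and the `requested` helper are hypothetical and are not the code in metadata_requests.py.

```python
class ToyEstimator:
    """Hypothetical stand-in; not scikit-learn's implementation."""

    def __init__(self):
        # method name -> {metadata key -> requested or not}
        self._requests = {"fit": {}, "score": {}}

    def fit_requests(self, **kwargs):
        # Record which metadata keys `fit` asks for.
        self._requests["fit"].update(kwargs)
        return self  # returning self is what makes the calls chainable

    def score_requests(self, **kwargs):
        # Record which metadata keys `score` asks for.
        self._requests["score"].update(kwargs)
        return self

    def requested(self, method):
        # Keys explicitly requested (value True) for the given method.
        return sorted(k for k, v in self._requests[method].items() if v is True)


# "Old behavior" from the diff comment: weights used in fit, not in score.
est = ToyEstimator().fit_requests(sample_weight=True).score_requests(sample_weight=False)
fit_keys = est.requested("fit")
score_keys = est.requested("score")
```

With `score_requests(sample_weight=True)` instead, a router built on this idea would also forward the weights to `score`, which is the behavior change the diff comment asks about.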

sklearn/metrics/_scorer.py

sklearn/utils/metadata_requests.py

adrinjalali · 2021-10-01T18:32:45Z

The API side seems to be getting close to where we're happy with it (@jnothman tell me if I'm wrong). I'm wondering how we can make this easier to move forward. Should we have a branch on the repo, where we start by merging the API/base stuff in a PR first, then fix estimators one by one, and then add some common tests?

@glemaitre you showed interest in the last meeting, any help is welcome :)

adrinjalali · 2021-10-01T18:33:08Z

It'd also be nice if @thomasjpfan could check the API once again.

jnothman · 2021-10-03T13:14:43Z

With apologies for my general absence, that sounds like a good strategy for breaking down the pull request. I do consider TransformedTargetRegressor, RFE, etc. to be meta-estimators. The meta-estimator should be validating the metadata keys passed to it, but not the values unless it is doing something like splitting the dataset. Proper routing, aliasing and validation in these meta-estimators ensures capabilities such as allowing them, too, to consume metadata unambiguously.
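
A minimal sketch of the distinction drawn above, checking metadata keys while leaving the values untouched. The function name `validate_metadata_keys` and the `requested` argument are hypothetical, chosen only for illustration; they are not taken from this PR.

```python
def validate_metadata_keys(passed, requested):
    """Raise if a passed metadata key was not requested by any sub-estimator.

    Only the keys are checked; the values are forwarded as-is.
    """
    unexpected = sorted(set(passed) - set(requested))
    if unexpected:
        raise TypeError(f"Unexpected metadata keys: {unexpected}")
    return dict(passed)


# A requested key passes through with its value unchanged.
routed = validate_metadata_keys({"sample_weight": [1.0, 2.0]}, {"sample_weight"})

# A key nobody requested is rejected, regardless of its value.
try:
    validate_metadata_keys({"groups": [0, 1]}, {"sample_weight"})
    rejected = False
except TypeError:
    rejected = True
```

A meta-estimator that splits the data (cross-validation, for example) would additionally need to look at the values in order to index them, which is the exception mentioned in the comment.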

adrinjalali added 30 commits January 9, 2020 17:58

first try...almost

b0b6fd1

working pipeline

b33940d

grid search

a397748

adding some tests

6a3b725

Merge remote-tracking branch 'upstream/master' into sample-props

0b38a88

pep8

e2ca8da

Merge remote-tracking branch 'upstream/master' into sample-props

3472b34

moving function out of base class to validaiton, adding docs

5113015

Merge remote-tracking branch 'upstream/master' into sample-props

c70474f

Merge remote-tracking branch 'upstream/master' into sample-props

8810724

refactor and simplify code

4486ec0

merged master, half way trough scoring

9b47761

Merge remote-tracking branch 'upstream/master' into sample-props

8a116f5

first scoring param in GS works

987b289

simplify set_props_request

09ae61f

rename to medata_request

2ef3bad

fix docstring and None inputs

7f8ae6f

fix scorers' issues

5bd1c83

minor cleanup

d8b55af

tests are okay with parameters not being passed but requested

5b8fd65

accept old style fit params

9c1b772

make test_pipeline pass, ignore future warnings

8eca900

include sample_weights in **kwargs in metrics

8948d45

pep8

1851419

separate _MetadataConsumer

421a673

don't pass sample_weight=None in metrics

1a1c2b9

Merge remote-tracking branch 'upstream/master' into sample-props

50317ee

cleanup and ignore private attrs set in __init__ in common tests

4967802

rfe passes score params in fit, and cleanup

32d8020

fixes to pass model_selection tests

845204b

adrinjalali mentioned this pull request Jul 22, 2021

RFC Private attributes copied in clone #20585

Closed

adrinjalali added 2 commits July 22, 2021 16:41

fix GB's routing and metadata request

4056721

Merge remote-tracking branch 'upstream/main' into sample-props-alternate

075e810

adrinjalali mentioned this pull request Jul 23, 2021

RFC Make estimator tags available at class level instead of instance level #20590

Closed

adrinjalali added 3 commits July 23, 2021 16:35

fix n_classes check in gp

e65848e

better overwrite semantics

4cd44eb

Merge remote-tracking branch 'upstream/main' into sample-props-alternate

d985145

thomasjpfan mentioned this pull request Aug 9, 2021

ENH allow extra params to be copied in clone #20681

Closed

This was referenced Sep 23, 2021

DRAFT: Implements fit params for RFECV #21113

Closed

Add sample_weight fit param for Pipeline #18159

Closed

ogrisel mentioned this pull request Sep 27, 2021

FIX CalibratedClassifierCV should not ignore sample_weight if estimator does not support it #21143

Closed

Merge remote-tracking branch 'upstream/main' into sample-props-alternate

c707d77

ogrisel mentioned this pull request Sep 30, 2021

Meta-estimator will ignore sample_weight when a Pipeline is passed #21134

Closed

adrinjalali added 8 commits October 1, 2021 11:40

add 'smart' overwrite option

ee90987

fix RFE param validation in fit and score

855064f

Merge remote-tracking branch 'upstream/main' into sample-props-alternate

af1dea5

DOC remove link to doc from getting started

d1ca507

improve docstring in TargetRegressor

b5fc380

fix overwrite in _gb.py

1034bce

model -> est in _ridge.py

1300867

remove score_params from make_scorer

7341975

adrinjalali commented Oct 1, 2021

This was referenced Oct 8, 2021

Base sample-prop implementation and docs #21284

Closed

Enable mixed ensembles with estimators that do & don't accept the sample_weight fit_param #20167

Closed

adrinjalali mentioned this pull request Dec 27, 2021

Base sample-prop implementation and docs (alternative to #21284) #22083

Merged

jnothman closed this Mar 12, 2022

jnothman mentioned this pull request Mar 19, 2022

SLEP006 - Metadata Routing task list #22893

Open

28 tasks
