FEA Add metadata routing to RidgeCV and RidgeClassifierCV #27560
Conversation
I need to have a closer look at this.
@adrinjalali I did some refactoring. Could you kindly have a look?
@adrinjalali Any updates on this PR?
Sorry, this was lost in my notifications. LGTM other than the nits.
```python
def _get_scorer(self):
    return check_scoring(self, scoring=self.scoring, allow_none=True)

def _score_without_scorer(self, squared_errors):
```
a very small docstring for these two methods would be nice.
Added.
I am wondering why we don't pass score_params to it?
So in this case, we can only compute the MSE, so there are no score_params that can be passed. However, I'm concerned about sample_weight, which is the only parameter that we could take into account for the GCV scheme:

scikit-learn/sklearn/linear_model/_ridge.py, lines 2151 to 2156 in 1d22a48:

```python
G_inverse_diag, c = solve(float(alpha), y, sqrt_sw, X_mean, *decomposition)
if scorer is None:
    squared_errors = (c / G_inverse_diag) ** 2
    alpha_score = self._score_without_scorer(squared_errors=squared_errors)
    if self.store_cv_results:
        self.cv_results_[:, i] = squared_errors.ravel()
```

So I see that we use sample_weight in the solve step. However, the squared errors are not weighted with sample_weight. In the case where we have a scorer, I'm under the impression that we would reapply the weights to the metric. Do I miss something?
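To illustrate the mismatch I mean with plain NumPy (a minimal sketch, not the actual GCV code path; the arrays are made up):

```python
import numpy as np
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
y_true = rng.normal(size=6)
y_pred = y_true + rng.normal(scale=0.5, size=6)
# the last sample is zero-weighted
sample_weight = np.array([1.0, 1.0, 1.0, 1.0, 1.0, 0.0])

squared_errors = (y_true - y_pred) ** 2

# scorer-less branch: plain, unweighted mean of the squared errors
score_no_scorer = -squared_errors.mean()

# what a neg-MSE scorer computes when sample_weight is routed to it
score_with_scorer = -mean_squared_error(y_true, y_pred, sample_weight=sample_weight)

print(score_no_scorer, score_with_scorer)  # the two branches disagree
```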
I think this was the default case, present when we don't have any scorer. Here we calculate the squared errors with a single line of code, as shown above. Do we require any params to correctly calculate the alpha_score from the squared errors?
So here would be such an example:
```python
import numpy as np
import sklearn
from sklearn.linear_model import RidgeCV
from sklearn.metrics import get_scorer
from sklearn.pipeline import make_pipeline

sklearn.set_config(enable_metadata_routing=True)

rng = np.random.default_rng(0)
coef = rng.uniform(size=(10,))
X = rng.normal(size=(1_000, 10))
y = X @ coef
y[-1] = 1_000  # add outlier
sample_weight = np.ones_like(y)
sample_weight[-1] = 0.0

ridge_no_scoring = make_pipeline(
    RidgeCV(scoring=None)
    .set_score_request(sample_weight=True)
    .set_fit_request(sample_weight=True)
)
ridge_no_scoring.fit(X, y, sample_weight=sample_weight)

scorer = get_scorer("neg_mean_squared_error").set_score_request(
    sample_weight=True
)
ridge_with_scorer = make_pipeline(
    RidgeCV(scoring=scorer)
    .set_score_request(sample_weight=True)
    .set_fit_request(sample_weight=True)
)
ridge_with_scorer.fit(X, y, sample_weight=sample_weight)

print(
    f"Ridge with no scoring: {ridge_no_scoring[-1].best_score_}\n"
    f"Ridge with a scorer: {ridge_with_scorer[-1].best_score_}"
)
np.testing.assert_allclose(
    ridge_no_scoring[-1].coef_, ridge_with_scorer[-1].coef_
)
```

which outputs:

```
Ridge with no scoring: -4.0321032019694904e-08
Ridge with a scorer: -4.036139341310801e-08
```
I would have expected a much bigger difference if we had not taken the weights into account. So I'm wondering whether this is not something even more subtle in the end. Here, we can increase the value of the outlier and see that the error does not increase at all.
So looking closer, the following line is used to generate the predictions (as described in http://cbcl.mit.edu/publications/ps/MIT-CSAIL-TR-2007-025.pdf):

```python
predictions = y - (c / G_inverse_diag)
```

However, if we use the prediction function from RidgeCV, we would have something equivalent to:

```python
X @ safe_sparse_dot(c.T, X)
```

And the results are slightly different. I have to check in more detail what the reason is.
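For what it's worth, the LOO shortcut behind that line can be checked numerically against brute-force refitting (a sketch assuming plain ridge without intercept or sample weights; the variable names are mine):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, alpha = 30, 5, 1.0
X = rng.normal(size=(n, p))
y = X @ rng.uniform(size=p) + 0.1 * rng.normal(size=n)

# Hat matrix of ridge without intercept: H = X (X'X + alpha*I)^-1 X'
H = X @ np.linalg.solve(X.T @ X + alpha * np.eye(p), X.T)
# Closed-form LOO residuals: (y_i - yhat_i) / (1 - H_ii)
loo_shortcut = (y - H @ y) / (1.0 - np.diag(H))

# Brute force: refit the ridge n times, each time leaving one sample out
loo_brute = np.empty(n)
for i in range(n):
    mask = np.arange(n) != i
    coef = np.linalg.solve(
        X[mask].T @ X[mask] + alpha * np.eye(p), X[mask].T @ y[mask]
    )
    loo_brute[i] = y[i] - X[i] @ coef

# The two agree to numerical precision
np.testing.assert_allclose(loo_shortcut, loo_brute)
```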
So forget whatever I said before; @StefanieSenger was right from the start, this is just an issue with sample_weight. So if we want to have the same call as with a scorer, we would need the following function:

```python
def _score_without_scorer(self, squared_errors, score_params):
    """Perform scoring using squared errors when the scorer is None."""
    axis = 0 if self.alpha_per_target else None
    weights = score_params.get("sample_weight", None)
    return -_average(squared_errors, axis=axis, weights=weights)
```

Basically, mean divides by the number of elements while _average divides by the sum of the sample_weight.

One thing to notice: if we use this function, it means that we will always forward sample_weight to the score function. So we lose the possibility of the equivalent of get_scorer("neg_mean_squared_error").set_score_request(sample_weight=False) in this case.
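A small NumPy illustration of that difference, including the axis=0 branch used by alpha_per_target (np.average stands in for the private _average helper; the numbers are made up):

```python
import numpy as np

# squared errors for 4 samples and 2 targets; the last sample is an
# outlier that gets zero weight
squared_errors = np.array(
    [[1.0, 10.0],
     [2.0, 20.0],
     [3.0, 30.0],
     [100.0, 0.0]]
)
weights = np.array([1.0, 1.0, 1.0, 0.0])

# mean divides by n_samples=4, average divides by weights.sum()=3
unweighted = -squared_errors.mean(axis=0)      # -26.5 and -15.0 per target
weighted = -np.average(squared_errors, axis=0, weights=weights)  # -2.0 and -20.0

print(unweighted)
print(weighted)
```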
I am confused about the premises of the discussion, which boils down to the question whether sample_weight should be used consistently only with metadata routing enabled, or also with it disabled.
When we enable metadata routing, do we want to mimic the old behaviour (meaning the different scores should re-appear), or would we route sample_weight even if it's not passed otherwise?
It seems that the desired result of the tests was to have the same scores if metadata routing is enabled, accepting that they are not the same without routing enabled, but then the discussion switched to fixing the inconsistency of the original code?
Sorry, I have the impression that this is all crystal clear to everyone but myself. I don't want to re-open any closed question, only to retrospectively follow the discussion.
We had a small sync talk with @StefanieSenger, but here is a summary of whatever happened in this thread :). Posting it here to make sure the thread becomes clearer.
Yep, the discussion went sideways. Sorry about that. I used this thread as a debugging thread :).
To summarize, I'm not too worried about the consistency between disabling/enabling metadata routing, because we might not reach the proper behaviour without metadata routing and a meta-estimator. What is really important here is to have the right behaviour when metadata routing is enabled. It means:

- when `scoring is None`, we need to make sure that `sample_weight` is taken into account. Basically, this is not configurable by the user since they do not pass a scorer.
- so the behaviour of `scoring is None` is equivalent to passing a scorer using the mean squared error and setting the score parameter to accept sample weight. We should be able to test this equivalence.

For me this is the behaviour that we should make sure is right. So it results in adding sample_weight or score_params to _score_without_scorer.
@agramfort in case you have some bandwidth for this.
Just a couple of nitpicks. Otherwise LGTM.
Thanks @OmarManzoor All good on my side.
Reference Issues/PRs
Towards: #22893
What does this implement/fix? Explain your changes.
Any other comments?
CC: @adrinjalali @glemaitre