FIX `LogisticRegressionCV.score` and `_BaseScorer` metadata routing #30859

adrinjalali · 2025-02-19T12:30:31Z

Fixes #30817

Depends on #31891, #31898

Two issues fixed in this PR:

- LogisticRegressionCV had a sample_weight arg in its score method, which made it a consumer of sample_weight while also being a router. This PR removes sample_weight as a consumed arg from that method.
- _BaseScorer didn't implement get_metadata_routing, and as a result the default implementation didn't correctly detect sample_weight in the __call__ signature of those scorers.

Needs tests and refining the error message regarding scorer.__call__

cc @Dalesrox, @antoinebaker

github-actions · 2025-02-19T12:32:01Z

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: cab1492.

StefanieSenger · 2025-02-19T12:32:37Z

How did you discover these bugs?

Edit: oh, it was related to the issue, sorry, I didn't see that in time.

adrinjalali · 2025-03-10T19:22:02Z

sklearn/metrics/tests/test_score_objects.py

+        err_msg = re.escape(
+            "[sample_weight] are passed but are not explicitly set as requested or not"
+            " requested for _Scorer.score, which is used within test.score. Call"
+            " `_Scorer.set_score_request({metadata}=True/False)` for each"


we can replace _Scorer with __repr__ once #30946 is merged.

adrinjalali · 2025-03-10T20:45:32Z

sklearn/linear_model/_logistic.py

@@ -1762,6 +1764,8 @@ class LogisticRegressionCV(LogisticRegression, LinearClassifierMixin, BaseEstima
    0.98...
    """

+    # TODO(1.9): remove this when sample_weight is removed from the `score` signature
+    __metadata_request__score = {"sample_weight": metadata_routing.UNUSED}


This prevents set_score_request from being present on this class.
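To make the idea concrete, here is a toy sketch of how marking a parameter as unused can keep a generated setter off a class. It is not scikit-learn's implementation: `ToyRequester`, the `UNUSED` sentinel, and the single-underscore attribute `_metadata_request_score` are all hypothetical names chosen for the illustration.

```python
import inspect

UNUSED = "$UNUSED$"  # hypothetical sentinel used only in this sketch


class ToyRequester:
    """Toy base class: adds ``set_score_request`` when ``score`` takes metadata."""

    # Single-underscore name chosen for the sketch to sidestep name mangling.
    _metadata_request_score = {}

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        score = cls.__dict__.get("score")
        if score is None:
            return
        # Collect signature parameters that look like consumable metadata.
        metadata = [
            name
            for name, param in inspect.signature(score).parameters.items()
            if name not in {"self", "X", "y"}
            and param.kind is not inspect.Parameter.VAR_KEYWORD
            and cls._metadata_request_score.get(name) != UNUSED
        ]
        if metadata:

            def set_score_request(self, **requests):
                self._score_requests = requests
                return self

            cls.set_score_request = set_score_request


class Consumer(ToyRequester):
    def score(self, X, y, sample_weight=None):
        return 0.0


class Router(ToyRequester):
    # sample_weight is still in the signature, but marked as not consumed here.
    _metadata_request_score = {"sample_weight": UNUSED}

    def score(self, X, y, sample_weight=None, **score_params):
        return 0.0


print(hasattr(Consumer, "set_score_request"))  # True
print(hasattr(Router, "set_score_request"))  # False
```

In this sketch the router keeps `sample_weight` in its signature for backward compatibility, yet exposes no request setter for it.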

adrinjalali · 2025-03-10T20:46:02Z

sklearn/linear_model/tests/test_logistic.py

@@ -2262,18 +2262,18 @@ def test_lr_cv_scores_differ_when_sample_weight_is_requested():
    sample_weight[: len(y) // 2] = 2
    kwargs = {"sample_weight": sample_weight}

-    scorer1 = get_scorer("accuracy")
+    scorer1 = get_scorer("accuracy").set_score_request(sample_weight=False)


Now that scorers properly request their routing, this is required since we're passing sample_weight below.

adrinjalali · 2025-03-10T20:47:10Z

sklearn/metrics/_scorer.py

+                score_method=self._score_func,
+                ignore_params={"y_true", "y_pred"},


We explicitly pass the _score_func so that the right metadata can be deduced from its signature.

And we need to ignore y_true and y_pred in that process.

Do we need to ignore y_prob, y_proba, y_score, labels_true, labels_pred, pred_decision as well? (These are some of the names the first two args of a score function can have.) Maybe an easier way to skip them all would be to ignore the first two positional arguments of the score function?

I don't feel safe skipping the first two params, but I'm adding some more names to the list.
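The concern in this thread can be illustrated with a small standard-library sketch. It assumes, for illustration only, that candidate metadata names are read off a score function's signature minus an ignore list; `metadata_params`, `accuracy_like`, and `roc_auc_like` are hypothetical names, not scikit-learn functions.

```python
import inspect


def metadata_params(func, ignore_params=frozenset()):
    """Return parameter names that could be treated as metadata.

    Toy helper (not scikit-learn code): skips names in ``ignore_params``
    as well as ``*args`` / ``**kwargs`` catch-alls.
    """
    skipped_kinds = (
        inspect.Parameter.VAR_POSITIONAL,
        inspect.Parameter.VAR_KEYWORD,
    )
    return [
        name
        for name, param in inspect.signature(func).parameters.items()
        if name not in ignore_params and param.kind not in skipped_kinds
    ]


def accuracy_like(y_true, y_pred, sample_weight=None):
    return 1.0


def roc_auc_like(y_true, y_score, sample_weight=None):
    return 1.0


ignore = {"y_true", "y_pred"}
print(metadata_params(accuracy_like, ignore))  # ['sample_weight']
# With only y_true/y_pred ignored, y_score is mistaken for metadata.
print(metadata_params(roc_auc_like, ignore))  # ['y_score', 'sample_weight']
print(metadata_params(roc_auc_like, ignore | {"y_score"}))  # ['sample_weight']
```

Extending the ignore list by name, as opposed to dropping the first two positional parameters, keeps the behaviour explicit for score functions with unusual signatures.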

adrinjalali · 2025-03-10T20:48:39Z

@OmarManzoor @antoinebaker this is ready for review now.

OmarManzoor

Thank you for the PR @adrinjalali

OmarManzoor · 2025-03-11T05:51:49Z

sklearn/linear_model/_logistic.py

@@ -2231,13 +2242,14 @@ def score(self, X, y, sample_weight=None, **score_params):
            Score of self.predict(X) w.r.t. y.
        """
        _raise_for_params(score_params, self, "score")
+        if sample_weight is not None:
+            score_params["sample_weight"] = sample_weight


Since we intend to remove sample_weight, shouldn't we also update the condition used when routing is not enabled to be
if "sample_weight" in score_params: instead of if sample_weight is not None:?

Not really: once it is removed, there won't be any sample_weight here and these two lines get removed as well. Added a comment for it.
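A minimal sketch of the transitional pattern discussed here, written as a free function with the hypothetical name `score_sketch` (the real method lives on the estimator and does much more): while `sample_weight` is still an explicit parameter, it is folded into `score_params` so the remaining code only deals with one dict.

```python
def score_sketch(X, y, sample_weight=None, **score_params):
    """Toy stand-in for a ``score`` method during the deprecation period."""
    # Transitional shim: merge an explicitly passed sample_weight into the
    # generic metadata dict. Once the parameter is removed from the
    # signature, these two lines disappear and sample_weight simply arrives
    # through **score_params.
    if sample_weight is not None:
        score_params["sample_weight"] = sample_weight
    return score_params


print(score_sketch(None, None, sample_weight=[1, 2]))
print(score_sketch(None, None, groups=[0, 1]))
```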

antoinebaker

A first round of reviews, but from my limited understanding of the metadata routing API, I probably don't get all the logic right 😉

antoinebaker · 2025-03-18T14:33:35Z

sklearn/metrics/_scorer.py

+                score_method=self._score_func,
+                ignore_params={"y_true", "y_pred"},


Do we need to ignore y_prob, y_proba, y_score, labels_true, labels_pred, pred_decision as well? (These are some of the names the first two args of a score function can have.) Maybe an easier way to skip them all would be to ignore the first two positional arguments of the score function?

antoinebaker · 2025-03-18T14:58:51Z

sklearn/utils/_metadata_requests.py

+                cls._build_request_for_signature(
+                    method_name=method,
+                    method_obj=score_method if score_method != "score" else None,
+                    ignore_params=ignore_params,
+                ),


Out of curiosity, why do the generic _get_default_requests and _build_request_for_signature need to be modified specifically for scorers? Is it because in general we rely on the method signature, but for scorers we instead rely on the scorer._score_func signature?

If so, I am wondering if we should rather redefine _get_default_requests and _build_request_for_signature in _BaseScorer (instead of changing the _MetadataRequester mixin).

We could do that, but then we'd have quite a bit of copy/pasted code, and I think we'd rather not? But this makes me realise we should override _get_metadata_request in _BaseScorer instead.

antoinebaker · 2025-03-18T15:11:35Z

sklearn/metrics/_scorer.py

+    # TODO (1.9): remove in 1.9
+    @_deprecate_positional_args(version="1.9")
+    def __call__(self, estimator, X, y_true, *, sample_weight=None, **kwargs):


Out of curiosity, why do we need to deprecate sample_weight as a positional arg? Is it directly related to this PR or is it a general guideline?

sample_weight is nothing special here, it's just metadata passed down the line. There's no reason to have it explicitly in the signature, since it only complicates code.
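For readers unfamiliar with this kind of deprecation, the following is a toy decorator that warns when a keyword-only parameter is still passed positionally. It is a sketch under stated assumptions, not scikit-learn's private helper: `deprecate_positional` and `call_scorer` are hypothetical names.

```python
import functools
import inspect
import warnings


def deprecate_positional(func):
    """Warn when keyword-only parameters are passed positionally (toy version)."""
    params = inspect.signature(func).parameters
    n_positional = sum(
        p.kind is inspect.Parameter.POSITIONAL_OR_KEYWORD for p in params.values()
    )
    kwonly = [
        name
        for name, p in params.items()
        if p.kind is inspect.Parameter.KEYWORD_ONLY
    ]

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        extra = args[n_positional:]
        if len(extra) > len(kwonly):
            raise TypeError("too many positional arguments")
        if extra:
            names = kwonly[: len(extra)]
            warnings.warn(
                f"Pass {', '.join(names)} as keyword args.", FutureWarning
            )
            # Re-route the positional extras to their keyword-only names.
            kwargs.update(zip(names, extra))
            args = args[:n_positional]
        return func(*args, **kwargs)

    return wrapper


@deprecate_positional
def call_scorer(estimator, X, y_true, *, sample_weight=None, **kwargs):
    return sample_weight


# Positional use still works during the deprecation window, with a warning.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", FutureWarning)
    print(call_scorer(None, None, None, [1, 2]))
```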

adrinjalali · 2025-07-30T13:29:20Z

From a sync discussion with @lesteve, related to #31599.

This code should fail with the expected error message (to be checked against this PR):

import numpy as np
from sklearn.metrics import make_scorer, mean_squared_error
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV
import sklearn

sklearn.set_config(enable_metadata_routing=True)

def score_func_1(y_pred, y_true, sample_weight=None):
    print(f"sample_weight is None? {sample_weight is None}")
    return 1

custom_scorer = make_scorer(score_func_1)

def score_func(estimator, X, y, sample_weight=None, **kws):
    print(f"sample_weight is None? {sample_weight is None}")
    return 1


scorers = {
    "mse": make_scorer(mean_squared_error, greater_is_better=False).set_score_request(
        sample_weight=True
    ),
    "default": score_func,
}

rng = np.random.RandomState(0)
X = rng.rand(10, 2)
y = rng.rand(10)
sample_weight = rng.rand(10)

gs = GridSearchCV(
    Ridge().set_fit_request(sample_weight=True),
    scoring=custom_scorer,
    param_grid={"alpha": [0.1, 1, 10]},
    refit=False,
)
gs.fit(X, y, sample_weight=sample_weight)

EDIT

The code above gives:

    raise UnsetMetadataPassedError(
sklearn.exceptions.UnsetMetadataPassedError: [sample_weight] are passed but are not explicitly set as requested or not requested for _Scorer.score, which is used within GridSearchCV.fit. Call `_Scorer.set_score_request({metadata}=True/False)` for each metadata you want to request/ignore. See the Metadata Routing User guide <https://scikit-learn.org/stable/metadata_routing.html> for more information.
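The error above follows from the three request states a consumer can be in: requested (True), explicitly not requested (False), or unset (None). The sketch below is a toy model of that rule, not scikit-learn's routing code; `route` and `UnsetMetadataError` are hypothetical names, and metadata missing from `requests` is treated as unset for simplicity.

```python
class UnsetMetadataError(ValueError):
    """Hypothetical stand-in for scikit-learn's UnsetMetadataPassedError."""


def route(requests, passed):
    """Return the metadata to forward, given per-key request states.

    ``requests`` maps a metadata name to True (requested), False (explicitly
    not requested) or None (unset). Toy logic for illustration only.
    """
    unset = [key for key in passed if requests.get(key) is None]
    if unset:
        raise UnsetMetadataError(
            f"{unset} are passed but are not explicitly set as requested "
            "or not requested."
        )
    # Forward only what was explicitly requested; drop what was declined.
    return {key: value for key, value in passed.items() if requests[key]}


print(route({"sample_weight": True}, {"sample_weight": [1.0, 2.0]}))
print(route({"sample_weight": False}, {"sample_weight": [1.0, 2.0]}))
try:
    route({"sample_weight": None}, {"sample_weight": [1.0, 2.0]})
except UnsetMetadataError as exc:
    print(f"error: {exc}")
```

In this model, a scorer that leaves its request unset makes any call passing sample_weight fail until the user sets the request explicitly.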

adrinjalali · 2025-08-11T11:45:49Z

Issue right now for reference:

TunedThresholdClassifierCV has balanced_accuracy as its default for scoring, which means it doesn't request sample_weight by default, which means

TunedThresholdClassifierCV(estimator).fit(X, y, sample_weight=blah) fails by default, and the fix is TunedThresholdClassifierCV(estimator, scoring=get_scorer("balanced_accuracy").set_score_request(sample_weight=True)).fit(...), which is not very intuitive.

FIX LogisticRegressionCV.score and _BaseScorer metadata routing

e17acbb

github-actions bot added module:linear_model module:metrics module:utils labels Feb 19, 2025

adrinjalali mentioned this pull request Mar 4, 2025

Pipeline score asks to explicitly request sample_weight #30937

Open

adrinjalali added 4 commits March 10, 2025 18:53

...

38e18d6

...

8c3bf31

Merge remote-tracking branch 'upstream/main' into logregcv_score

21e4907

changelog

76c7e31

adrinjalali commented Mar 10, 2025

adrinjalali marked this pull request as ready for review March 10, 2025 20:48

OmarManzoor reviewed Mar 11, 2025

antoinebaker reviewed Mar 18, 2025

adrinjalali mentioned this pull request Jul 31, 2025

_MultimetricScorer deals with _accept_sample_weights inconsistently #31599

Open

adrinjalali added 4 commits August 6, 2025 15:09

Merge remote-tracking branch 'upstream/main' into logregcv_score

71fd4ff

reviews

c2b2799

some progress

82cdb1a

FIX make scorer.repr work with a partial score_func

0ede482

adrinjalali mentioned this pull request Aug 7, 2025

FIX make scorer.repr work with a partial score_func #31891

Merged

adrinjalali added 7 commits August 7, 2025 14:23

add changelog

aa4c3ef

Merge branch 'metci/partial' into logregcv_score

a40c7b7

FIX make sure _PassthroughScorer works with meta-estimators

63569a6

changelog

d6218f7

Merge remote-tracking branch 'upstream/main' into slep6/pipeline/score

1ac76e2

Merge remote-tracking branch 'upstream/main' into slep6/pipeline/score

c8057b5

Merge remote-tracking branch 'upstream/main' into logregcv_score

783e736

Merge branch 'slep6/pipeline/score' into logregcv_score

52653ee

TunedThresholdClassifierCV issue

cab1492

adrinjalali added this to Metadata routing Aug 11, 2025

adrinjalali moved this to To Review in Metadata routing Aug 11, 2025
