Description
Our documentation here states that a callable with the signature (estimator, X, y) is a valid scorer. However, it isn't.
In #31599, it is observed that passing such an object fails in the context of a _MultimetricScorer.
While working on other metadata routing issues, I found that TunedThresholdClassifierCV also fails with such an object, since it creates a _CurveScorer which ignores the object and expects to just use the _score_func of a given scorer object.
Consider the following script:
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import TunedThresholdClassifierCV, cross_val_score
from sklearn.datasets import make_classification
from sklearn.metrics._scorer import _Scorer, mean_squared_error, make_scorer


class MyScorer(_Scorer):
    def _score(self, *args, **kwargs):
        print("I'm logging stuff")
        return super()._score(*args, **kwargs)


def my_scorer(estimator, X, y, **kwargs):
    print("I'm logging stuff in my_scorer")
    return mean_squared_error(estimator.predict(X), y, **kwargs)


def my_metric(y_pred, y_true, **kwargs):
    print("I'm logging stuff in my_metric")
    return mean_squared_error(y_pred, y_true, **kwargs)


my_second_scorer = make_scorer(my_metric)

X, y = make_classification()

# this prints logs
print("cross_val_score'ing")
cross_val_score(
    LogisticRegression(),
    X,
    y,
    scoring=MyScorer(mean_squared_error, sign=1, kwargs={}, response_method="predict"),
)

print("1. TunedThresholdClassifierCV'ing")
model = TunedThresholdClassifierCV(
    LogisticRegression(),
    # scoring=MyScorer(mean_squared_error, sign=1, kwargs={}, response_method="predict"),
    # scoring=my_scorer,
    scoring=my_second_scorer,
)
model.fit(X, y)

print("2. TunedThresholdClassifierCV'ing")
model = TunedThresholdClassifierCV(
    LogisticRegression(),
    scoring=MyScorer(mean_squared_error, sign=1, kwargs={}, response_method="predict"),
)
model.fit(X, y)

print("3. TunedThresholdClassifierCV'ing")
model = TunedThresholdClassifierCV(
    LogisticRegression(),
    scoring=my_scorer,
)
model.fit(X, y)
It includes 3 different ways of creating a scorer for TunedThresholdClassifierCV. The first one works as expected, the second one ignores the implemented _score function, and the third one raises an error:
cross_val_score'ing
I'm logging stuff
I'm logging stuff
I'm logging stuff
I'm logging stuff
I'm logging stuff
1. TunedThresholdClassifierCV'ing
I'm logging stuff in my_metric
(... repeated many more times ...)
2. TunedThresholdClassifierCV'ing
3. TunedThresholdClassifierCV'ing
Traceback (most recent call last):
  File "/tmp/1.py", line 54, in <module>
    model.fit(X, y)
  File "/path/to/scikit-learn/sklearn/base.py", line 1366, in wrapper
    return fit_method(estimator, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/scikit-learn/sklearn/model_selection/_classification_threshold.py", line 129, in fit
    self._fit(X, y, **params)
  File "/path/to/scikit-learn/sklearn/model_selection/_classification_threshold.py", line 743, in _fit
    self._curve_scorer = self._get_curve_scorer()
                         ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/scikit-learn/sklearn/model_selection/_classification_threshold.py", line 880, in _get_curve_scorer
    curve_scorer = _CurveScorer.from_scorer(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/scikit-learn/sklearn/metrics/_scorer.py", line 1141, in from_scorer
    score_func=scorer._score_func,
               ^^^^^^^^^^^^^^^^^^
AttributeError: 'function' object has no attribute '_score_func'
One could argue that MyScorer(mean_squared_error, sign=1, kwargs={}, response_method="predict") is undocumented since it inherits from a private class. However, we have people using it, and #31540 discusses making these classes public, which we seem to be on board with.
This makes me think we really don't support a custom callable as a scorer, and that this claim should be removed from our documentation. The only way a scorer works across the board is when all custom work happens inside the _score_func, which is what the codebase currently assumes.
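To make the mechanism concrete, here is a toy model in plain Python of what seems to be happening. It is only a sketch: ToyScorer, ToyCurveScorer, LoggingScorer, ConstantEstimator, and plain_callable are hypothetical stand-ins, not scikit-learn classes. The only behaviour taken from the traceback above is that from_scorer reads scorer._score_func.

```python
# Toy model only: every class here is a hypothetical stand-in, not scikit-learn code.


def mse(y_true, y_pred):
    """Mean squared error on plain Python lists."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


class ToyScorer:
    """Stand-in for a scorer object that wraps a metric function."""

    def __init__(self, score_func):
        self._score_func = score_func

    def _score(self, estimator, X, y):
        return self._score_func(y, estimator.predict(X))

    def __call__(self, estimator, X, y):
        return self._score(estimator, X, y)


class ToyCurveScorer(ToyScorer):
    """Stand-in for a scorer built from another scorer."""

    @classmethod
    def from_scorer(cls, scorer):
        # Only the wrapped metric is copied; the scorer object itself is dropped.
        return cls(scorer._score_func)


class LoggingScorer(ToyScorer):
    """Customizes _score, like MyScorer in the script above."""

    def __init__(self, score_func):
        super().__init__(score_func)
        self.calls = 0

    def _score(self, estimator, X, y):
        self.calls += 1
        return super()._score(estimator, X, y)


class ConstantEstimator:
    def predict(self, X):
        return [0 for _ in X]


def plain_callable(estimator, X, y):
    return mse(y, estimator.predict(X))


est, X, y = ConstantEstimator(), [[0], [1]], [0, 2]

# Case 2: the overridden _score is never reached, so nothing is counted.
logging_scorer = LoggingScorer(mse)
curve = ToyCurveScorer.from_scorer(logging_scorer)
curve_score = curve(est, X, y)
calls_after_curve = logging_scorer.calls

# Case 3: a plain function has no _score_func attribute at all.
try:
    ToyCurveScorer.from_scorer(plain_callable)
    error = None
except AttributeError as exc:
    error = exc
```

In this model the score is still computed, but LoggingScorer's override is silently bypassed and the plain callable fails outright, which matches cases 2 and 3 in the output above.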