ENH annotate metrics to simplify populating SCORERS #1774
Conversation
I actually implemented exactly this and we decided against it.
I am not entirely sure why you say information is not duplicated. Why is it duplicated now? It is now stored in the scorer file, while you store it on the scoring functions. Anyhow, I actually do like this approach, but it was voted against.
Oh, well. That's annoying. Can you point me to the reasons against? I don't mean not duplicated; I mean not in multiple places. Assuming #1768, note that with this patch we can do something like the following:

```python
# Imports assume this patch's annotated metrics and the Scorer class
# as it existed in sklearn at the time (~0.13).
from sklearn.externals import six
from sklearn.metrics import (Scorer, f1_score, precision_score, recall_score,
                             accuracy_score, hinge_loss, precision_recall_curve,
                             roc_curve, confusion_matrix)


class CompositeScorer(Scorer):
    def __init__(self, objective, **other_metrics):
        super(CompositeScorer, self).__init__(objective)
        self.metrics = ([], [])  # group metrics by their needs_threshold value
        other_metrics['score'] = objective
        for name, metric in six.iteritems(other_metrics):
            needs_threshold = self._get_annotation(metric, 'needs_threshold', False)
            self.metrics[int(needs_threshold)].append((name, metric))

    def calc_scores(self, estimator, X, y=None):
        for needs_threshold, metrics in enumerate(self.metrics):
            if not metrics:
                continue
            if needs_threshold:
                # threshold metrics consume a continuous decision value
                try:
                    y_pred = estimator.decision_function(X).ravel()
                except (NotImplementedError, AttributeError):
                    y_pred = estimator.predict_proba(X)[:, 1]
            else:
                y_pred = estimator.predict(X)
            for name, metric in metrics:
                yield name, metric(y, y_pred)


binary_debug_scorer = CompositeScorer(
    f1_score,
    precision=precision_score, recall=recall_score,
    accuracy=accuracy_score, hinge_loss=hinge_loss,
    pr_curve=precision_recall_curve, roc_curve=roc_curve,
    confusion=confusion_matrix)
```
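As a rough usage sketch (hypothetical names: assumes a binary classifier `clf` already fitted, plus held-out `X_test` and `y_test`), `calc_scores` would yield every score in a single pass over the estimator's outputs:

```python
scores = dict(binary_debug_scorer.calc_scores(clf, X_test, y_test))
print(scores['score'])      # the f1_score objective
print(scores['precision'])  # one of the secondary metrics
```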
Hum, I never really looked into more than one score actually ;)
@amueller Where was the vote? I like this approach better than the current one too.
In the three pull requests I sent in and the massive thread on the ML. Actually, the discussion there was whether we should let the score functions know how they can be applied. I worked on this for a long time and had endless discussions to get any working solution into sklearn, and I am a bit fed up with the issue. The current state is probably not the perfect final solution, though.
Alright. Shall we close this PR? |
Not sure ;)
Right, so this comes down to "what is a `Scorer`?"
With this patch it still performs the first function, but is not necessary for the second. With #1768 it also:
Things it does not do:
So, what's a `Scorer`?
The idea behind the scorer was that it can do more than just threshold. Have you read the other pull requests and the thread? Now you say you want multiple scores. I don't really see the use case for that, I must admit (I'd actually like fewer, more aggregated views of the results, not more verbose ones). But if you can implement what you want with the scorer interface, that is nice :) Btw, if you want to write less for the `SCORERS`...
Yes, of course it does... so it makes the metric not just a function of the input to the estimator, but of the estimator itself. No, I haven't read up on the history yet. It seems the problem comes down to the domain of the `Scorer`. Yes, I considered using ...
Or perhaps the metrics should be annotated directly with something like:

```python
@applies_to('predict')
@applies_to('decision_function', 'predict_proba')
```

:s For the moment I am persuaded that this PR might be ignored until we decide whether we're interested in composite scorers as in #1768.
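For illustration, a minimal sketch of such a decorator (hypothetical: `applies_to` and the attribute it sets are not part of this patch or of scikit-learn):

```python
def applies_to(*methods):
    # Hypothetical decorator: record which estimator methods feed this metric.
    def decorate(metric):
        metric.applies_to = methods
        return metric
    return decorate


@applies_to('decision_function', 'predict_proba')
def example_threshold_metric(y_true, y_score):
    # Placeholder metric, purely for illustration.
    return ((y_true * 2 - 1) * y_score).mean()
```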
I think that this PR can be closed given the excellent work of many people (see #2123).
This patch makes `needs_threshold` and `greater_is_better` an attribute of each metric, rather than providing them when a `Scorer` is constructed from them. I am not sure these are the most appropriate names, but their use means information isn't duplicated in `scorer.py`. (It also would allow composite `Scorer`s to be easily constructed, assuming their support in #1768 is granted.)

Perhaps there should be other machine-readable annotations on our library of metrics:

- `is_symmetric`
- `multiclass`/`binary`/`regression`

but their application is not clear to me.
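As a minimal sketch of the idea described above (the flag names follow the patch's description; the standalone helper shown here is an assumption about how a `Scorer` could read them):

```python
from sklearn.metrics import f1_score, hinge_loss

# Annotate the metric functions themselves, once, at definition site.
f1_score.greater_is_better = True
hinge_loss.greater_is_better = False
hinge_loss.needs_threshold = True


def _get_annotation(metric, name, default):
    # Fall back to a default when a metric carries no explicit annotation.
    return getattr(metric, name, default)


assert _get_annotation(hinge_loss, 'needs_threshold', False)
assert not _get_annotation(f1_score, 'needs_threshold', False)
```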