
[MRG] Adds _MultimetricScorer for Optimized Scoring #14484


Closed
wants to merge 22 commits

Conversation

thomasjpfan
Member

@thomasjpfan thomasjpfan commented Jul 26, 2019

Reference Issues/PRs

Fixes #10802
Alternative to #10979

What does this implement/fix? Explain your changes.

  1. This PR creates a _MultimetricScorer that subclasses dict and is used to reduce the number of calls to predict, predict_proba, and decision_function (a rough sketch of the idea follows this list).

  2. The public interface of objects and functions that use scoring is unchanged.

  3. The cache is only used when it is beneficial, as determined by _MultimetricScorer._use_cache.
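A rough sketch of the caching idea (illustrative only, not the exact code in this PR; the _score(method_caller, ...) hook on individual scorers is an assumption about how they cooperate with the cache):

class _MultimetricScorer(dict):
    """Maps scorer name -> scorer and evaluates all of them with a shared cache."""

    def __call__(self, estimator, X, y_true):
        cache = {}  # method name -> prediction, computed at most once

        def cached_call(method_name):
            if method_name not in cache:
                cache[method_name] = getattr(estimator, method_name)(X)
            return cache[method_name]

        # Every scorer requests predictions through cached_call, so e.g.
        # predict_proba is computed once even if several metrics need it.
        return {name: scorer._score(cached_call, estimator, X, y_true)
                for name, scorer in self.items()}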

Any other comments?

I plan to support custom callables from the user that return dictionaries. That work is not included here in order to keep the scope of this PR to _MultimetricScorer.

@thomasjpfan thomasjpfan changed the title [MRG] Adds Multimetric Scorer [MRG] Extends Scorer for Multimetric Scoring Jul 26, 2019
@jnothman
Member

Thanks for this @thomasjpfan!

Is there any reason not to just generically support scoring: Callable[[Estimator, X, y, ...], Dict[Str, Numeric]] for multiple metrics? Your multimetric scorer has a couple of additional public attributes: the score_infos and returns_dict. Would it be sufficient to dynamically determine the return type and keys when the scorer is called?

If so, I think this becomes conceptually simpler for users (the existing list and dict specifications for scoring become shorthands for such a callable), and doesn't involve introducing a custom type.
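For concreteness, the kind of generic callable being suggested would look something like this (hypothetical user code, not an API that exists in this PR):

from sklearn.metrics import accuracy_score, balanced_accuracy_score

def scoring(estimator, X, y):
    y_pred = estimator.predict(X)  # computed once for both metrics
    return {
        "accuracy": accuracy_score(y, y_pred),
        "balanced_accuracy": balanced_accuracy_score(y, y_pred),
    }

# e.g. cross_validate(clf, X, y, scoring=scoring); the existing list and dict
# specifications for scoring would then just be shorthands for such a callable.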

@thomasjpfan
Member Author

@jnothman

Is there any reason not to just generically support scoring: Callable[[Estimator, X, y, ...], Dict[Str, Numeric]] for multiple metrics?

A user would need to build their callable to be smart about calling predict, predict_proba, and decision_function. It also would not let a user build efficient multimetric scorers out of individual scikit-learn scorers.

Your multimetric scorer has a couple of additional public attributes: the score_infos and returns_dict.

This is used internally by _fit_and_score, cross_validate, and BaseSearchCV to inspect the scorer object. On master, this information is transferred by _check_multimetric_scoring through a dictionary and a bool: the dictionary gives the name of each scorer, and the bool indicates whether the scoring is multimetric. This PR extends _Scorer to wrap the information provided by _check_multimetric_scoring.
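Roughly, that flow on master looks like this (paths and exact signature hedged; for illustration only):

from sklearn.linear_model import LogisticRegression
from sklearn.metrics.scorer import _check_multimetric_scoring

est = LogisticRegression()
scorers, is_multimetric = _check_multimetric_scoring(est, scoring=["accuracy", "f1"])
# scorers        -> {"accuracy": <scorer>, "f1": <scorer>}
# is_multimetric -> True; _fit_and_score and BaseSearchCV branch on this flag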

I have been considering making make_multimetric_scorer private and keeping the public interface exactly the same. Our private make_multimetric_scorer can do smart things with scorers made by make_scorer.

@amueller
Member

Just had a discussion with @thomasjpfan with some ideas of how to simplify this.

What I find strange about the current design is that our scorers are used just to provide meta-data and I think I would like to represent that meta-data more explicitly.

@thomasjpfan thomasjpfan changed the title [MRG] Extends Scorer for Multimetric Scoring [WIP] Extends Scorer for Multimetric Scoring Jul 31, 2019
@jnothman
Member

jnothman commented Jul 31, 2019 via email

@thomasjpfan
Member Author

But it allows them to easily avoid repeated computation in something like
precision_recall_fscore_support... Essentially to efficiently reuse a
single confusion matrix computation.

Good point. Allowing users to pass a callable that returns a dictionary would be pretty useful.
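For example (hypothetical user code), a single dict-returning callable could compute precision, recall, and F-score from one pass over the predictions instead of using three separate scorers:

from sklearn.metrics import precision_recall_fscore_support

def prf_scoring(estimator, X, y):
    y_pred = estimator.predict(X)
    precision, recall, fscore, _ = precision_recall_fscore_support(
        y, y_pred, average="binary")
    return {"precision": precision, "recall": recall, "f1": fscore}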

Would it be sufficient to dynamically determine the return type and keys when the scorer is called?

We could get away with this if the following error_score handling did not exist:

if is_multimetric:
    test_scores = dict(zip(scorer.keys(),
                           [error_score, ] * n_scorers))
    if return_train_score:
        train_scores = dict(zip(scorer.keys(),
                                [error_score, ] * n_scorers))

With all this feedback, this PR will be in flux, so I am marking it WIP.

@NicolasHug
Member

Please ping me when you need reviews!

@thomasjpfan thomasjpfan changed the title [WIP] Extends Scorer for Multimetric Scoring [MRG] Adds _MultimetricScorer for Optimized Scoring Aug 7, 2019
@thomasjpfan
Member Author

@NicolasHug

This is ready for a review.

  1. This PR creates a _MultimetricScorer that subclasses dict and is used to reduce the number of calls to predict, predict_proba, and decision_function.

  2. The public interface of objects and functions that use scoring is unchanged.

  3. The cache is only used when it is beneficial, as determined by _MultimetricScorer._use_cache.

I plan to support custom callables from the user that return dictionaries. That work is not included here in order to keep the scope of this PR to _MultimetricScorer.

@NicolasHug
Member

@thomasjpfan do you need a thorough review or just feedback on the API? For the former I could use a user guide, some examples and some comments / docstrings ;)

@thomasjpfan
Member Author

thomasjpfan commented Aug 7, 2019

For the former I could use a user guide, some examples and some comments / docstrings ;)

There are no public API changes. I added docstrings for the private _MultimetricScorer. This PR moves the multimetric scoring logic into metrics/scorer.py, which is slightly more natural than keeping that logic in model_selection/_validation.py.

@thomasjpfan
Member Author

Closed and resubmitted as #14593

@thomasjpfan thomasjpfan closed this Aug 7, 2019