[WIP] Multiple-metric grid search #2759
Conversation
We are not supposed to use `parameters` outside of the loop. And this makes the code very difficult to read.
@amueller wrote:
I'd really like to see this happen. I'd happily attempt to complete the PR. What do you consider to still be lacking, @mblondel? I am however a little concerned that any code that attempts efficient calculation of multiple scorers (with the current definition of scorer) is going to be frameworkish to a great extent, and hence will be difficult to get merged. Is there some way to limit this?
I am no longer working on this and would be glad if somebody could take over.
Note: this PR fixes #1850.
I still consider this feature sorely missing and of high priority, especially seeing as it was possible to get multiple metrics back prior to the advent of scorers (there was no output checking on the score function's return value back then).

Obviously a lot of the codebase has changed since this PR was launched, and much of the work yet to be done is transferring changes onto moved code. I do wonder whether there's a way to get it merged piece by piece in an agile way, or whether we just need a monolithic PR at the risk of growing stale again. Certainly, some of the auxiliary changes could be merged separately.

I think @mblondel has made some reasonable API decisions here, but we should decide on the following. The only substantial question, I think, is whether scores should be a dict {scorer_name: score_data_structure} for all of the reported results, or some other structure; the other issue is a more minor one. @amueller and others, do you have an opinion on these API issues?

Apart from those things, what remains to be done appears to be: moving the changes to the current codebase; ensuring test coverage; documentation; and an example or two.
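For illustration only (the names and values below are hypothetical, not part of this PR's API), results keyed by scorer name could be as simple as:

```python
import numpy as np

# Hypothetical sketch: per-scorer results keyed by scorer name, one value
# per CV fold. Nothing here reflects the PR's actual data structures.
grid_point_scores = {
    "accuracy": np.array([0.91, 0.89, 0.93]),
    "roc_auc": np.array([0.95, 0.94, 0.97]),
}
best_score_per_metric = {name: folds.mean() for name, folds in grid_point_scores.items()}
```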
Hi everyone, it seems that the scorer API has changed a lot since this PR was started. I have tried to follow the discussions in issue #1850 and PR #2123, and would like to ask your opinion on working on this enhancement. I understand that this touches a lot of API and needs a strong decision from the core devs. I would like to know if there is a possibility for a newbie to work on this? Thanks.
I don't think the scorer API has changed much, no.
Oh, really sorry, I have been going through all the related PRs for some time and confused it with changes made elsewhere.
I think it would be a decent thing to work on, if you're comfortable with that part of the code base.
I have looked into the main aim of this PR, coming from here and the discussion at #1850, but if it is okay, could I ask about the decisions regarding:
Sorry for asking so many questions. I am not that aware of the practical use cases and am trying to get an idea based on the discussions here. Thanks again for patiently answering.
Sorry @rvraghav93, I didn't know that you intended to work on this issue. I just thought it was challenging and also important; that's why I looked at it.
Yes. @amueller and I had a discussion on this. I've also let @GaelVaroquaux know that I will be working on this after my Tree PRs.
Yay for this happening!
Something that we really need to support in this work (ping @rvraghav93 if you see yourself as doing this) is returning per-class performance for multilabel and multiclass problems without making a scorer for each class. At first glance I frankly have no idea how to do this neatly.
To make myself clearer, do we have two problems to face here?
What will we do for the case of multi-metric multi-label? Can I suggest that we settle on a notation for this? Either that, or we have to agree on using a list of arrays for a key of the results.
Even with this approach, we might have to be okay with a list of arrays for a key. In which case, can we allow a list of 2d arrays itself? But how do we rank them in that case?
With the implemented metrics, performance for any particular class for any particular metric is currently a scalar. But we should first worry about implementing multiple-metric scoring where each metric returns one value. If need be (with a runtime and API specification cost), class-wise metrics can all be specified to return a single scalar, just by wrapping the scorer appropriately.
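As a minimal sketch of that wrapping idea (using the existing `make_scorer` API, with `f1_score` and the label values chosen purely for illustration; this is not code from the PR):

```python
from sklearn.metrics import f1_score, make_scorer

# Restrict the metric to a single label and macro-average over that one
# label, so each scorer returns a scalar for its class alone.
per_class_f1 = {
    "f1_class_%d" % label: make_scorer(f1_score, labels=[label], average="macro")
    for label in (0, 1, 2)
}
```

Each of these could then be passed around as an ordinary single-value scorer, at the cost of recomputing predictions once per class.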
@jnothman Thanks for the comment. I'll raise a PR soon. (Hopefully by this weekend, while my other PRs are pending review/comments ;( )
Fixed in #7388
This PR brings multiple-metric grid search. This is important for finding the best-tuned estimator on a per-metric basis without redoing the grid / randomized search from scratch for each metric.

Highlights:

- Refactor `cross_val_score` so as to support lists as the `scoring` parameter. In this case, a 2d array of shape `(n_scoring, n_folds)` is returned instead of a 1d array of shape `(n_folds,)`.
- The same `scoring` lists are supported in `GridSearchCV` and `RandomizedSearchCV`.
- Add `_evaluate_scorers` for computing several scorers without recomputing the predictions every time, including scorers with `needs_threshold=True`.

Tagging this PR with the v0.15 milestone.
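As a usage sketch of the behaviour described above (this reflects the API proposed in this work-in-progress PR and the module layout of that era, not the interface that was eventually released):

```python
# Sketch only: in released scikit-learn, `scoring` does not accept a list here.
from sklearn.cross_validation import cross_val_score  # pre-model_selection layout
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(random_state=0)

# Proposed behaviour: passing a list for `scoring` yields a 2d array of
# shape (n_scoring, n_folds), here (2, 5), instead of the usual (n_folds,).
scores = cross_val_score(LogisticRegression(), X, y,
                         scoring=["accuracy", "roc_auc"], cv=5)
```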