Following on from #14593, and an enabler for #12385.
A user is currently able to specify multiple metrics for scoring in cross validation or model selection by setting scoring to a dict that maps names to scoring functions/callables. We also allow the shorthand of specifying multiple predefined scorers by a list of names.
Following on from #14593, it should be relatively easy to allow the user to instead provide a callable which has the same input parameters as a scorer (estimator, X, y, ...) but which returns a dict mapping names to scores.
This would allow the computation of scores to be more efficient (e.g. computing confusion matrices once then aggregating them in various ways), and also enables us (or third parties) to provide prefabricated scorer collections (per #12385), which would amount to a diagnostic suite for your cross validation performance.
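As a rough sketch of the idea (the function name is hypothetical and this relies on the proposed feature, not on any existing API), such a callable could compute the confusion matrix once and derive several scores from it:

```python
from sklearn.metrics import confusion_matrix

def confusion_scores(estimator, X, y):
    # Hypothetical multimetric scorer for the proposed API: it has the same
    # signature as a scorer (estimator, X, y) but returns a dict mapping
    # names to scores, computing the confusion matrix only once.
    y_pred = estimator.predict(X)
    cm = confusion_matrix(y, y_pred)
    tn, fp, fn, tp = cm.ravel()  # assumes a binary classification problem
    return {
        "accuracy": (tp + tn) / cm.sum(),
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }
```

Under the proposal, this would be passed directly, e.g. scoring=confusion_scores in cross_validate or GridSearchCV, instead of a dict of separate scorers that would each call predict on their own.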
@thomasjpfan had expressed interest in implementing this but I thought I'd make an issue so it doesn't get lost and so that we can let Thomas focus on other things if there is a contributor willing to do it.
Note: care will need to be taken to handle the case that the scorer returns dicts with different keys in different calls. Either this should trigger an error or it should leave some results undefined.
I have an implementation of this that I want to polish up before submitting a PR. I am hoping to enable this feature while making the multimetric code easier to maintain.
Note: Another consideration is that refit can be a string, which could be passed to _fit_and_score so that we can check dynamically whether that key is in the dictionary returned by the callable. (Triggering an error after the search has completed seems bad.)
Note: care will need to be taken to handle the case that the scorer returns dicts with different keys in different calls. Either this should trigger an error or it should leave some results undefined.
I think it should error. Again, triggering an error after the search has completed seems bad. It would be very nice to have a global mapping of job_id to keys, so each job can check whether the keys it returns are consistent and error if not.
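A minimal sketch of that kind of check (the helper name and call site are hypothetical, not existing scikit-learn internals):

```python
def _check_multimetric_keys(scores, expected_keys):
    # Hypothetical helper: compare the keys returned by the scorer callable
    # on this call against the keys seen on a reference call, and raise
    # immediately rather than after the whole search has finished.
    returned = set(scores)
    if returned != set(expected_keys):
        raise ValueError(
            "Multimetric scorer returned keys %r, expected %r"
            % (sorted(returned), sorted(expected_keys))
        )
```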
Because it's such an exceptional thing for the user to pass a callable that does not give consistent keys, I'd focus more on usability than being adamant that an error is the way to go.