Cross-validation supports optional sample weights #7112
Conversation
This fix enables the use of sample weights in cross-validation, i.e., in cross_val_score and cross_val_predict. Since most classifiers (estimators) explicitly accept sample weights in fit(), this capability is needed to measure performance adequately in those cases.
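As context for the discussion below, here is a minimal sketch (not this PR's implementation) of how per-sample weights can already be routed to fit() through the fit_params argument of cross_val_score, assuming a scikit-learn version recent enough to slice array-like fit_params per fold. The weights only reach the estimator's fit(); the scorer still counts every test sample equally, which is the gap this PR is about.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=200, random_state=0)
weights = np.random.RandomState(0).rand(200)  # illustrative per-sample weights

# fit_params entries of length n_samples are indexed per fold, so each
# training split sees only its own weights; scoring remains unweighted.
scores = cross_val_score(
    LogisticRegression(),
    X, y,
    cv=5,
    fit_params={"sample_weight": weights},
)
print(scores)
```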
Failed tests should now be fixed.
from sklearn.metrics.scorer import check_scoring
from sklearn.utils.fixes import bincount
from sklearn.gaussian_process.kernels import Kernel as GPKernel
from sklearn.exceptions import FitFailedWarning
Why did you change this? Imports should always be relative.
Thanks for your PR. It cannot be merged as it is. You should modify the files in the model_selection module and leave the cross_validation.py file untouched, as it is there only for legacy. You need to add tests for the new functionality. You shouldn't change things like relative imports.
For prediction, I think you are right, but not for scoring. I'll try to address Gael's comments and resubmit the PR.

Ilya
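To make the prediction-vs-scoring distinction above concrete, here is a hedged sketch of what weighted scoring could look like with a plain KFold loop (an illustration, not the code proposed in this PR): the same weights are passed both to fit() and to the metric's sample_weight argument, so heavy samples influence both training and evaluation.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import KFold

X, y = make_classification(n_samples=200, random_state=0)
w = np.random.RandomState(0).rand(200)  # illustrative per-sample weights

scores = []
for train, test in KFold(n_splits=5).split(X):
    clf = LogisticRegression().fit(X[train], y[train], sample_weight=w[train])
    pred = clf.predict(X[test])
    # weighted score: each test sample contributes in proportion to its weight
    scores.append(accuracy_score(y[test], pred, sample_weight=w[test]))
print(np.mean(scores))
```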
@ilyaeck what happened to this pull request? Does fit_params handle weighted scoring? I am interested in the case where the sample weights are counts (for grouped data, to reduce memory/computation). In this case, AFAIK, the cross-validation process itself should be changed, i.e., to replicate uniformly sampling the ungrouped data.
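For the counts-as-weights case described above, a small illustration (hypothetical data, not from this PR) of why the splitting itself is affected: when each row stands for `count` identical samples, splitting the grouped rows is not equivalent to uniformly splitting the ungrouped data, because a row with a large count lands entirely in one fold.

```python
import numpy as np

# Three grouped rows, where each row represents `count` identical samples.
X_grouped = np.array([[0.0], [1.0], [2.0]])
y_grouped = np.array([0, 1, 1])
counts = np.array([5, 1, 3])

# The equivalent ungrouped data has sum(counts) = 9 rows; cross-validating
# on these 9 rows is what a weight-aware CV would ideally emulate.
X_full = np.repeat(X_grouped, counts, axis=0)
y_full = np.repeat(y_grouped, counts)
print(X_full.shape, y_full.shape)  # (9, 1) (9,)
```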
It never went through, so nothing happened, I'm afraid.
Ilya
Closely related to #4497. I think this can safely be closed, as too many things changed in the meantime. This is quite a tricky issue.