CalibratedClassifierCV does not handle well sample_weight when ensemble=False #20610
Comments
Yep, I assume that we should not support this case, since we currently cannot pass any
It seems to me that it is possible to pass I replaced the content of the
and it does correct the problem. Do you think it could have any negative side effects?
Ah right, we pass using
I don't see any. As you mentioned, this is what we do in the
Do you wish to make a PR implementing the fix that you propose, together with a test that checks that the behaviour is fine?
Yes, I have started looking at the contributing guidelines in order to do so.
@BenjaminBossan I think this is fixed with your PR in #24126 (or was fixed before). Could you please confirm?
I confirm.
On Thu, Aug 18, 2022 at 18:36, Adrin ***@***.***> wrote:
@BenjaminBossan I think this is fixed with your PR in #24126 (or was fixed before). Could you please confirm?
CalibratedClassifierCV does not handle sample_weight well with ensemble=False

In the fit method, sample_weight is not passed to cross_val_predict to generate the prediction scores (https://github.com/scikit-learn/scikit-learn/blob/2beed5584/sklearn/calibration.py#L325), whereas it is passed to fit when the classifier is refitted on the entire dataset (https://github.com/scikit-learn/scikit-learn/blob/2beed5584/sklearn/calibration.py#L328).

This makes the calibration fail, because it breaks the assumption that the classifiers built in each cv split of cross_val_predict behave similarly to the one trained on the whole dataset at the end.

To correct the bug, I suggest passing sample_weight to cross_val_predict using the fit_params dictionary.

Example to reproduce the issue:
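The snippet below is not the reporter's original example. It is only a minimal sketch of the suggested fix, showing that forwarding sample_weight to cross_val_predict changes the cross-validated scores that the calibrator would be trained on. The synthetic data, the LogisticRegression base estimator, and the weights are all assumptions made for illustration. Newer scikit-learn releases name the keyword params rather than fit_params, so the sketch picks whichever one the installed version accepts.

```python
import inspect

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

# Synthetic, noisy binary problem (an assumption, not data from the report).
rng = np.random.RandomState(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + 0.5 * rng.normal(size=200) > 0).astype(int)

# Hypothetical weights: positive samples count ten times as much.
sample_weight = np.where(y == 1, 10.0, 1.0)

clf = LogisticRegression()

# The issue refers to `fit_params`; newer releases call the keyword `params`.
cv_signature = inspect.signature(cross_val_predict).parameters
kwarg = "params" if "params" in cv_signature else "fit_params"

# Scores obtained when the weights are NOT forwarded (the reported behaviour).
unweighted = cross_val_predict(clf, X, y, cv=5, method="predict_proba")

# Scores obtained when the weights ARE forwarded; they are sliced per split.
weighted = cross_val_predict(
    clf, X, y, cv=5, method="predict_proba",
    **{kwarg: {"sample_weight": sample_weight}},
)

# A non-zero gap means the per-split classifiers differ from the weighted
# classifier that is refitted on the whole dataset afterwards.
print("max abs difference in scores:", np.abs(weighted - unweighted).max())
```

If the two score arrays differ, a calibrator fitted on the unweighted scores is being matched to a classifier that was trained differently, which is the mismatch described above.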
Versions
System:
python: 3.7.10 (default, Feb 26 2021, 13:06:18) [MSC v.1916 64 bit (AMD64)]
executable: C:\HOMEWARE\Anaconda3-Windows-x86_64\envs\python37\python.exe
machine: Windows-10-10.0.18362-SP0
Python dependencies:
pip: 21.1.3
setuptools: 52.0.0.post20210125
sklearn: 0.24.2
numpy: 1.20.2
scipy: 1.6.2
Cython: None
pandas: 1.2.5
matplotlib: 3.3.4
joblib: 1.0.1
threadpoolctl: 2.1.0
Built with OpenMP: True