-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
Fix sample weight handling in scoring _log_reg_scoring_path #29419
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix sample weight handling in scoring _log_reg_scoring_path #29419
Conversation
Thanks for the PR. Could you please add a non regression test based on the reproducer provided in #29416 and make it run on all solvers with a small value of |
Note: in the test_logistic_regression_sample_weight sag and saga are left out as they systematically fail |
@snath-xoc could you please push a commit that triggers running the updated test in this PR for all admissible values of EDIT: before testing on the CI, you should test locally with commands like the following:
at the moment, the test fails for many seeds but this can be fixed as explained in the review comments below. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is some feedback to help move this PR forward. Please don't forget to add an entry to the changelog.
I find this test too long to follow. You might want to decouple the part that tests equivalence between integer sample weights and their repeated counter part from the part of the test that checks equivalence between class weights and sample weights into two independent tests. |
To actually trigger |
test_logistic_regression_sample_weights
test_logistic_regression_sample_weights
test_logistic_regression_sample_weights
test_logistic_regression_sample_weights
test_logistic_regression_sample_weights
test_sample_and_class_weight_equivalence_liblinear test_logistic_regression_sample_weights test_logistic_regression_solver_class_weights
test_sample_and_class_weight_equivalence_liblinear test_logistic_regression_sample_weights test_logistic_regression_solver_class_weights
test_sample_and_class_weight_equivalence_liblinear test_logistic_regression_sample_weights test_logistic_regression_solver_class_weights
test_sample_and_class_weight_equivalence_liblinear test_logistic_regression_sample_weights test_logistic_regression_solver_class_weights
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Otherwise LGTM. Thanks @snath-xoc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some more minor feedback on top of Omar's but otherwise, LGTM!
Note for later: the def _more_tags(self):
return {
"_xfail_checks": {
"check_sample_weights_invariance": (
"zero sample_weight is not equivalent to removing samples"
),
}
} However This should better be done in a dedicated PR though. |
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org> Co-authored-by: Omar Salman <omar.salman2007@gmail.com>
Merged! Thanks very much @snath-xoc! |
Reference Issues/PRs
Fixes #29416
What does this implement/fix? Explain your changes.
Added sample weighting for test set into default calculation of scores within _log_reg_scoring_path
TO DO: