MNT Clean-up deprecations for 1.7: sample_weight as positional arg when not consumed #31119

Merged

Conversation

jeremiedbb
Member

Removed deprecated sample_weight as positional arg in estimators when it's only routed.

I haven't removed it from RANSAC yet: the fact that RANSAC only routes sample_weight is probably a bug, and it should also be a consumer, see #15836. Ping @ogrisel for confirmation. In that case, sample_weight should remain an explicit parameter, right? Ping @adrinjalali.

@jeremiedbb jeremiedbb added No Changelog Needed Quick Review For PRs that are quick to review labels Apr 1, 2025
@jeremiedbb jeremiedbb added this to the 1.7 milestone Apr 1, 2025

github-actions bot commented Apr 1, 2025

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 6f111d1. Link to the linter CI: here

Member

@adrinjalali adrinjalali left a comment

Otherwise LGTM.

Comment on lines -397 to -399
sample_weight = _check_sample_weight(sample_weight, X, dtype=None)
fit_params["sample_weight"] = sample_weight

Member Author

The allow suggestion didn't work here but I removed this validation because bagging doesn't consume sample_weight. So it should delegate validation to the underlying estimator.

Otherwise, if it turns out that bagging should be a consumer, we'll need to put sample_weight back as an explicit parameter, right?

Contributor

+1 to keep sample_weight as an explicit param for this release, as we will indeed need it for a PR fixing sample weight handling in Bagging estimators.

Member

I don't understand how that affects your work, can you elaborate?

Contributor

It's for #31165, where we use sample_weight for drawing the samples instead of passing it to the subestimators. Bagging is then a consumer of sample_weight, and consumer-only, as it should not pass the weights on to the underlying estimators.
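
To illustrate the idea of consuming sample_weight when drawing samples, here is a minimal sketch using only the standard library. It is not the code from #31165; the helper name `draw_bootstrap_indices` and its `seed` argument are hypothetical, and the weights are simply treated as relative draw probabilities.

```python
import random


def draw_bootstrap_indices(n_samples, sample_weight=None, seed=0):
    """Hypothetical helper: draw a bootstrap sample of row indices.

    The weights only influence which rows are drawn; they are not
    forwarded to whatever is later fitted on the drawn rows.
    """
    rng = random.Random(seed)
    # random.choices samples with replacement; weights=None means uniform.
    return rng.choices(range(n_samples), weights=sample_weight, k=n_samples)


# Rows 0 and 3 have zero weight, so they can never be drawn.
weights = [0.0, 1.0, 3.0, 0.0]
indices = draw_bootstrap_indices(4, sample_weight=weights)
```

In this sketch a sub-estimator would be fitted on the rows selected by `indices` without receiving any sample_weight, which is what "consumer-only" means in the comment above.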

Member

Oh interesting. I'd say we can merge this one, and your PR can add it as an explicit parameter. Nothing changes for the user's code, since they can pass sample_weight=sample_weight regardless of it being a part of kwargs or an explicit parameter.
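
The point about user code can be sketched with two plain functions. These are hypothetical stand-ins for a `fit` method, not scikit-learn code: one takes sample_weight as an explicit parameter, the other only receives it through `**fit_params`.

```python
def fit_explicit(X, y, sample_weight=None, **fit_params):
    """Hypothetical fit: sample_weight is a named parameter."""
    return {"sample_weight": sample_weight, **fit_params}


def fit_kwargs_only(X, y, **fit_params):
    """Hypothetical fit: sample_weight only arrives through **fit_params."""
    return dict(fit_params)


X, y, w = [[0.0], [1.0]], [0, 1], [0.5, 2.0]

# Passing by keyword behaves the same with both signatures.
explicit_result = fit_explicit(X, y, sample_weight=w)
kwargs_result = fit_kwargs_only(X, y, sample_weight=w)

# Passing positionally only works when the parameter is explicit.
try:
    fit_kwargs_only(X, y, w)
    positional_ok = True
except TypeError:
    positional_ok = False
```

Callers that already pass `sample_weight=...` by keyword see no difference between the two signatures; only positional calls are affected.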

Member Author

jeremiedbb commented Apr 9, 2025

Alright, so following an IRL discussion and this #31119 (comment), I've kept sample_weight as an explicit parameter for bagging and RANSAC, because both are known to have bugs in their sample_weight handling and should also be consumers.

This PR is ready for a final review, IMO.

@ogrisel ogrisel merged commit dcfb52b into scikit-learn:main Apr 15, 2025
36 checks passed
lucyleeow pushed a commit to EmilyXinyi/scikit-learn that referenced this pull request Apr 23, 2025
…en not consumed (scikit-learn#31119)

Co-authored-by: Adrin Jalali <adrin.jalali@gmail.com>
Co-authored-by: Loïc Estève <loic.esteve@ymail.com>