Skip to content

Conversation

betatim
Copy link
Member

@betatim betatim commented Sep 2, 2025

Follow up to #31951

This PR enables strict xfail mode for just the common estimator checks. This allows us to find common tests that were marked as xfail but have started passing. If a test outcome changes we either need to update our xfail list or it is a bug that needs adressing.

As part of this I found some checks that no longer fail, the xfail list is updated. There are also a few estimators that are tested in different configurations, and fail/pass depending on the configuration. I've implemented that by adding custom logic to pop the check from the xfail list for some configs.

For some of the checks related to sample weights I wonder if. they really pass now or if the check is not strict enough to detect that the estimator doesn't full fill the requirements implied by the check? Maybe @ogrisel knows more about this?

ping @adrinjalali who thought it could be interesting to enable this for the scikit-learn test suite. What do you think?

This allows us to find common tests that were marked as xfail but have
started passing. If a test outcome changes we either need to update our
xfail list or it is a bug that needs adressing.
Copy link

github-actions bot commented Sep 2, 2025

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: cc8f605. Link to the linter CI: here

@@ -845,24 +842,6 @@ def _yield_instances_for_check(check, estimator_orig):


PER_ESTIMATOR_XFAIL_CHECKS = {
AdaBoostClassifier: {
# TODO: replace by a statistical test, see meta-issue #16298
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an instance where the check doesn't fail but I am not sure if this is because AdaBoostClassifier has been fixed or because the check is not "good enough" to detect that sample weight handling is broken?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's possible that the check is a bit too weak. Still we can remove the XFAIL markers for this and maybe readd it later when needed if we ever make the check stronger.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants