Common check for sample weight invariance with removed samples #16507
Conversation
The current output of this test is,
    'check_sample_weights_invariance(kind=zeros)':
        'zero sample_weight is not equivalent to removing samples',
    }
}
Of course the amount of repetition could be reduced by defining this tag in BaseSVC, but the expectation is that these estimators will be fixed one by one, and it's easier to understand what needs fixing this way.
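For illustration, here is a minimal sketch of how such a tag can be declared on an estimator, assuming the `_xfail_checks` tag mechanism from #16502 (the exact tag name in the final code may differ; the estimator class below is hypothetical):

```python
from sklearn.base import BaseEstimator


class MyEstimator(BaseEstimator):
    """Hypothetical estimator that is known to fail the new check."""

    def fit(self, X, y=None, sample_weight=None):
        # ... fitting logic would go here ...
        return self

    def _more_tags(self):
        # Mark the check as a known failure with an explanatory message.
        return {
            '_xfail_checks': {
                'check_sample_weights_invariance(kind=zeros)':
                    'zero sample_weight is not equivalent to removing samples',
            }
        }
```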
# skip tests marked as a known failure and raise a warning
msg = xfail_checks[check_name]
warnings.warn(f'Skipping {check_name}: {msg}', SkipTestWarning)
continue
This is a change following #16502, analogous to what was added in test_common.py::test_estimators. Without it, test_check_estimator_clones started to fail with this PR, since it runs check_estimator on MiniBatchKMeans, which now has one xfail test. Skipping tests marked as xfail in check_estimator with a warning, as done here, is a solution to this issue.
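Roughly, the behaviour described above looks like the following sketch (not the exact PR code; the run_checks helper name is made up here):

```python
import warnings

from sklearn.exceptions import SkipTestWarning


def run_checks(estimator, checks, xfail_checks):
    """Run estimator checks, skipping those marked as known failures."""
    name = type(estimator).__name__
    for check in checks:
        # In the real code the check name also encodes keyword arguments,
        # e.g. 'check_sample_weights_invariance(kind=zeros)'.
        check_name = getattr(check, '__name__', str(check))
        if check_name in xfail_checks:
            # Skip tests marked as a known failure and raise a warning.
            msg = xfail_checks[check_name]
            warnings.warn(f'Skipping {check_name}: {msg}', SkipTestWarning)
            continue
        check(name, estimator)
```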
To answer the question why SVC fails even though it should have been fixed in #14286: looking at the failures, LinearSVC and LinearSVR look definitely broken, and they were not addressed in #14286. To make SVR, SVC, NuSVR and NuSVC pass, one would need to increase the relative tolerance to 6e-4, which is quite a lot. The test added in that PR checked coefficients with rtol=1e-3. Maybe that's the best we can do for libsvm (any 32-bit casting happening internally?), not sure.
Let's merge this and then we can fix estimators one by one.
This control over checks is quite satisfying to see in action :)
Is there a reason we are modifying y and not X? Doing so means this check only works for supervised estimators.
sklearn/utils/estimator_checks.py
err_msg = (f"For {name} sample_weight=None is not equivalent to " | ||
f"sample_weight=ones") | ||
elif kind == 'zeros': | ||
X2 = np.vstack([X, X]) |
Deserves a comment like "Construct a dataset that is very different to (X, y) if weights are disregarded, but identical to (X, y) given weights".
However, this doesn't really work when the estimator is unsupervised. This could pass if weights are disregarded.
Good point, thanks! Also added a modification to X.
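For reference, the idea behind the kind='zeros' construction (including the modification to X mentioned above) can be sketched as follows; this is a paraphrase of the check, not a verbatim copy of the PR code:

```python
import numpy as np

from sklearn.utils import shuffle


def make_zero_weight_dataset(X, y, random_state=0):
    """Build (X2, y2, sw2) that differs from (X, y) if weights are
    disregarded, but is equivalent to (X, y) once the zero weights on
    the perturbed half are taken into account."""
    X2 = np.vstack([X, X + 1])      # perturbed copy of X
    y2 = np.hstack([y, 3 - y])      # arbitrary perturbation of y
    sw2 = np.ones(2 * len(y))
    sw2[len(y):] = 0                # zero weight on the perturbed half
    return shuffle(X2, y2, sw2, random_state=random_state)
```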
    'check_sample_weights_invariance(kind=zeros)':
        'zero sample_weight is not equivalent to removing samples',
    }
}
RidgeClassifierCV now also started to fail, with quite a high tolerance required to pass.
Completely unrelated to this PR but since you've been doing this (which is helpful, thanks): do you think it'd be useful for github to support comments from PR author to reviewers, but something distinct from regular review comment?
Completely unrelated to this PR but since you've been doing this (which is helpful, thanks): do you think it'd be useful for github to support comments from PR author to reviewers, but something distinct from regular review comment?
Maybe, but I can't say the current way it works is an issue for me.
Thanks @rth, a few comments and questions.
# skip tests marked as a known failure and raise a warning
msg = xfail_checks[check_name]
warnings.warn(f'Skipping {check_name}: {msg}', SkipTestWarning)
continue
Can we apply the xfail decorator as a function instead of manually replicating its behavior?
I.e.
check = pytest.mark.xfail(check)
or something like that?
Because we have the try / except logic just below (and we might want to update the comment about pandas)
We can't use pytest in this file, since it's supposed to work without it.
There is indeed some slight redundancy with _mark_xfail_checks, but I think trying to factorize it might be more trouble than it's worth. It's just 6 extra lines in the end.
Updated the comment about pandas.
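For comparison, the pytest-based path (the _mark_xfail_checks helper used by parametrize_with_checks) can rely on pytest directly; a rough sketch of that approach, with made-up function and argument names:

```python
import pytest


def mark_if_xfail(estimator, check, check_name, xfail_checks):
    """Wrap an (estimator, check) pair with an xfail marker when the
    check is listed in the estimator's known-failure tag."""
    if check_name in xfail_checks:
        return pytest.param(
            estimator, check,
            marks=pytest.mark.xfail(reason=xfail_checks[check_name]))
    return pytest.param(estimator, check)
```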
I'd say the introduction of the tag is a good reason to start requiring pytest for using check_estimator now. But agreed, that's another story.

It's just 6 extra lines in the end

Sure, though I find our whole test suite really hard to maintain in general.
Sure, though I find our whole test suite really hard to maintain in general.
Agreed, I'm just saying that slightly more verbosity and 2 repetitions is easier to maintain than coming up with some function wrappers. The problem is not so much lines of code as complexity.
sklearn/utils/estimator_checks.py
if kind == 'ones':
    X2 = X
    y2 = y
    sw2 = None
Would it be more natural to set this to ones, and set the SW of estimator1 to None instead?
Indeed, fixed.
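After this change the kind='ones' branch compares a fit with sample_weight=None against a fit with unit weights; a self-contained sketch of that comparison (the helper name and the use of predict are illustrative only):

```python
import numpy as np

from sklearn.base import clone


def check_ones_weight_equivalence(name, estimator_orig, X, y):
    """Fitting without sample_weight must match fitting with unit weights."""
    est_none = clone(estimator_orig).fit(X, y, sample_weight=None)
    est_ones = clone(estimator_orig).fit(X, y, sample_weight=np.ones(len(y)))
    np.testing.assert_allclose(
        est_none.predict(X), est_ones.predict(X),
        err_msg=(f"For {name} sample_weight=None is not equivalent to "
                 f"sample_weight=ones"))
```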
LGTM, thanks @rth
@@ -816,7 +825,7 @@ def check_sample_weights_shape(name, estimator_orig):

 @ignore_warnings(category=FutureWarning)
-def check_sample_weights_invariance(name, estimator_orig):
+def check_sample_weights_invariance(name, estimator_orig, kind="ones"):
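With the new signature, the check can in principle be invoked directly for either mode (normally it runs through check_estimator or parametrize_with_checks; the estimator choice here is arbitrary, and each call raises an AssertionError if the corresponding invariance does not hold):

```python
from sklearn.linear_model import LogisticRegression
from sklearn.utils.estimator_checks import check_sample_weights_invariance

est = LogisticRegression()
check_sample_weights_invariance("LogisticRegression", est, kind="ones")
check_sample_weights_invariance("LogisticRegression", est, kind="zeros")
```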
Should this check be tested?
Well, for common tests there is a risk of false negatives and false positives.
Sorry, I don't have much availability to write detailed tests for this at the moment. I think it's more useful to have this in master (the original PR was 8 months ago) than to leave it in its current state; also to avoid blocking #15554. Merging with +2.
OK, no, actually I should have re-run CI, since there were outdated changes in check_estimator and the way xfail is handled. I was too optimistic. Fix in #17175.
Continues and closes #15015
Closes #5515
This merges the common check which ensures that setting sample_weight to 0 is equivalent to removing samples. Estimators that currently fail it are listed in #16298 and are marked as a known failure.
We use the _xfail_test estimator tag to mark estimators that xfail this test.
Also related to #11316, #15657.
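As a concrete illustration of the invariance being checked (Ridge is used here only as an example of an estimator where the equivalence is expected to hold):

```python
import numpy as np

from sklearn.linear_model import Ridge

X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([1.0, 2.0, 3.0, 4.0])
sw = np.array([1.0, 1.0, 0.0, 0.0])  # zero weight on the last two samples

est_weighted = Ridge().fit(X, y, sample_weight=sw)
est_subset = Ridge().fit(X[:2], y[:2])

# Zero sample_weight should be equivalent to removing those samples.
np.testing.assert_allclose(est_weighted.coef_, est_subset.coef_, rtol=1e-6)
```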