ENH Adds support to xfail in check_estimator. #16963
Conversation
I'm just concerned about the complexity of that test logic. IMO if the test is marked as xfail it should be skipped in check_estimator (possibly with a warning) and that's it. If users want nuanced xfail behavior they should use a proper testing framework, i.e. pytest with the xfail marker. Sorry, I should have commented on the issue earlier.
Since there is indeed a bug, I would just propose to bypass the test, possibly with a warning that it was skipped. We are using more advanced pytest features to run the tests and check the output status as XPASS/XFAIL, but we shouldn't need to re-implement that for check_estimator IMO. Or anything else that allows fixing the linked issue without adding too much complexity/code.
+1 to keep the logic simple here and just not run the checks marked as xfail.
Updated PR with (hopefully) simpler logic. |
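The "keep it simple" approach discussed above — drop any check listed in the `_xfail_checks` tag and warn that it was skipped — can be sketched as follows. This is an illustrative sketch, not the PR's actual code: `filter_xfail_checks` is a hypothetical helper, and the local `SkipTestWarning` class stands in for `sklearn.exceptions.SkipTestWarning`.

```python
import warnings


class SkipTestWarning(UserWarning):
    """Stand-in for sklearn.exceptions.SkipTestWarning."""


def filter_xfail_checks(checks, xfail_checks):
    """Yield only the checks not listed in the _xfail_checks tag,
    warning once for each check that is skipped."""
    for check in checks:
        reason = xfail_checks.get(check.__name__)
        if reason is not None:
            warnings.warn(f"Skipping {check.__name__}: {reason}",
                          SkipTestWarning)
        else:
            yield check
```

With this shape, check_estimator never has to reason about whether an xfail-marked check passed or failed — it simply never runs it.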
# When a _xfail_checks check passes, raise an error stating that the test
# passes and can be removed from the tag
My understanding was that we don't care about that scenario because we don't want to reimplement pytest's logic?
I changed it to issuing a warning. From an implementation point of view, it only takes 2 lines to implement this logic in _check_warns_on_fail. I would not want to keep silent about this, since a developer would not know if an xfail check is now passing.
sklearn/utils/estimator_checks.py
Outdated
if not hasattr(estimator, "_get_tags"):
    return {}
Why is this needed? Estimators inheriting from BaseEstimator should have a _get_tags method.
The following
assert_raises_regex(TypeError, msg, check_estimator, object)
would fail at the incorrect place.
I think we should just change the test then, which is what I did for the one just below. The previous error message wasn't that great anyway.
sklearn/utils/estimator_checks.py
Outdated
"""Get xfail_check from estimator""" | ||
if isinstance(estimator, type): | ||
# try to construct estimator instance, if it is unable to then | ||
# return then ignore xfail tag |
Suggested change:
- # return then ignore xfail tag
+ # ignore xfail tag
sklearn/utils/estimator_checks.py
Outdated
@@ -331,6 +332,44 @@ def _construct_instance(Estimator):
    return estimator


def _get_xfail_checks(estimator):
    """Get xfail_check from estimator"""
"""Get xfail_check from estimator""" | |
"""Get checks marked with xfail_checks tag from estimator""" |
sklearn/utils/estimator_checks.py
Outdated
"""Mark (estimator, check) pairs with xfail according to the | ||
_xfail_checks_ tag""" |
This needs an update now
sklearn/utils/estimator_checks.py
Outdated
return estimator._get_tags()['_xfail_checks'] or {}


def _check_warns_on_fail(estimator, check, xfail_checks_tag):
Do we really need to pass the estimator? It looks like the logic only affects the check.
It was not needed.
To address https://github.com/scikit-learn/scikit-learn/pull/16963/files#r414661838 it would be needed now.
sklearn/utils/estimator_checks.py
Outdated
return estimator._get_tags()['_xfail_checks'] or {}


def _check_warns_on_fail(estimator, check, xfail_checks_tag):
Call this _make_check_warn_on_fail?
sklearn/utils/estimator_checks.py
Outdated
warnings.warn(f"{check_name} passed, it can be removed "
              f"from the _xfail_check tag", SkipTestWarning)
should we also print the estimator name?
sklearn/utils/estimator_checks.py
Outdated
# did not fail
check_name = _set_check_estimator_ids(check)
warnings.warn(f"{check_name} passed, it can be removed "
              f"from the _xfail_check tag", SkipTestWarning)
Should this be a FutureWarning?
sklearn/utils/estimator_checks.py
Outdated
def _check_warns_on_fail(estimator, check, xfail_checks_tag):
    """Convert assertion errors to warnings if reason is given"""
We can probably provide more details here:
"""Convert assertion errors to warnings if reason is given""" | |
""" | |
Wrap the check so that a warning is raised: | |
- when the check is in the `xfail_checks` tag and the check properly failed as expected | |
- when the check is in the `xfail_checks` tag and the check didnt fail but should have | |
Checks that aren't in the xfail_checks tag aren't wrapped and are returned as-is. | |
This wrapper basically simulates what pytest would do with the @xfail decorator, but this one can be used with check_estimator() which doesn't rely on pytest. | |
""" |
sklearn/utils/estimator_checks.py
Outdated
return estimator, check

reason = xfail_checks_tag[check_name]
name = _get_estimator_name(estimator)
Suggested change:
- name = _get_estimator_name(estimator)
+ est_name = _get_estimator_name(estimator)
sklearn/utils/estimator_checks.py
Outdated
@@ -333,42 +334,100 @@ def _construct_instance(Estimator):
    return estimator


def _get_estimator_name(estimater):
Suggested change:
- def _get_estimator_name(estimater):
+ def _get_estimator_name(estimator):
sklearn/utils/estimator_checks.py
Outdated
# try to construct estimator instance, if it is unable to then
# return xfail tag
Suggested change:
- # try to construct estimator instance, if it is unable to then
- # return xfail tag
+ # try to construct estimator instance, if it is unable to then
+ # xfail_checks tag is ignored
# skips check_estimators_fit_returns_self based on _xfail_checks

assert_warns_message(SkipTestWarning, "This is a bad check",
                     check_estimator, LRXDoesNotRaise)
pass an instance?
assert_warns_message(FutureWarning,
                     "LRXFailTags:check_complex_data passed, it can "
                     "be removed from the _xfail_check tag",
                     check_estimator, LRXFailTags)
pass an instance?
def fit(self, X, y):
    # do not raise error for complex check
    try:
        return super().fit(X, y)
    except ValueError as e:
        if "Complex data not supported" not in str(e):
            raise

    return self
Sorry I don't understand what this class and the check below are supposed to test
Also what's the relation with check_estimators_fit_returns_self?
The comment needed to be updated to:
# skips check_complex_data based on _xfail_checks
@@ -592,6 +594,42 @@ def __init__(self, special_parameter):
    check_estimator, MyEstimator)


class LRXFailTags(LogisticRegression):
Should this be called LRXFailTagButCheckPasses?
for estimator, check in checks_generator:
    check_name = _set_check_estimator_ids(check)
    if check_name in xfail_checks_tag:
Any reason to remove the previous comments? These can't hurt IMO
return self


def test_check_estimator_xfail_tag_skips():
My understanding is that check_estimator does not skip anything?
This checks that it raises a SkipTestWarning. Renamed to test_check_estimator_xfail_tag_raises_skip_test_warning.
def test_check_estimator_xfail_tag_skips():
    # skips check_complex_data based on _xfail_checks
Can't we just check on an estimator with a properly set xfail_tags instead, like the Dummy one? I don't understand why we need to create a new estimator here.
I was thinking that if the estimator gets fixed and the xfail_tags gets removed, this test would start failing. Changed to use Dummy.
I think I'd be OK with the proposed changes but I still agree with @rth that it would be simpler to just purely ignore the xfail_checks in check_estimator.
sklearn/utils/estimator_checks.py
Outdated
try:
    check(*args, **kwargs)
except AssertionError:
    warnings.warn(reason, SkipTestWarning)
I find it a bit strange to raise a SkipTestWarning when the test wasn't actually skipped: the check was run, but we got a failure as expected.
I do not think we have a good warning for "expected to fail".
Maybe a fairly ambiguous "UserWarning".
Thanks! A couple of comments. Where I agree is that it's definitely not the most straightforward part of the codebase. Maybe some type annotations couldn't have hurt here, just for readability, particularly for e.g. the Estimator input that's actually either a class or an instance. Not asking for this PR but it could have been nice.
sklearn/utils/estimator_checks.py
Outdated
# got a class
return estimator.__name__
# got an instance
return type(estimator).__name__
Could you re-use this function in _set_check_estimator_ids where there is currently redundant code?
_set_check_estimator_ids gets the ID from an estimator using str. I removed this function to reduce the size of the diff, since it is not really needed anymore.
sklearn/utils/estimator_checks.py
Outdated
def wrapped(*args, **kwargs):
    try:
        check(*args, **kwargs)
    except AssertionError:
xfailed tests could fail with any exception, not just AssertionError
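To illustrate that point: a check that fails with, say, a ValueError would escape an `except AssertionError` clause entirely, while `except Exception` would treat it as the expected failure. The helper below is hypothetical, written only to demonstrate the difference:

```python
def failure_is_caught(check, exc_types):
    """Return True if running `check` raises one of `exc_types`;
    any other exception propagates, just as it would in the wrapper."""
    try:
        check()
    except exc_types:
        return True
    return False


def complex_data_check():
    # xfailed checks may fail with errors other than AssertionError
    raise ValueError("Complex data not supported")
```

Here `failure_is_caught(complex_data_check, AssertionError)` lets the ValueError escape, whereas `failure_is_caught(complex_data_check, Exception)` reports the failure as expected.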
This will need an update considering that classes aren't supported anymore. That being said, I proposed an alternative in #17219.
Reference Issues/PRs
Fixes #16958
What does this implement/fix? Explain your changes.
Allows check_estimator to warn according to the _xfail_checks tag. If a check is listed in _xfail_checks and an assert is raised, then a warning will appear. If a check is listed in _xfail_checks and NO assert is raised, then a warning is raised stating that the check can now be removed from _xfail_checks.
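Putting the described behaviour together on a toy example. Everything here — the estimator, the check, the `run_checks` runner, and the local `SkipTestWarning` stand-in — is illustrative, not scikit-learn's actual code:

```python
import warnings


class SkipTestWarning(UserWarning):
    """Stand-in for sklearn.exceptions.SkipTestWarning."""


class ToyEstimator:
    def _get_tags(self):
        # maps a check name to the reason the check is expected to fail
        return {"_xfail_checks": {
            "check_complex_data": "complex input is not supported yet"}}


def check_complex_data(estimator):
    raise AssertionError("complex data round-trip failed")


def run_checks(estimator, checks):
    """Run each check; checks in _xfail_checks warn instead of erroring."""
    xfail = estimator._get_tags().get("_xfail_checks") or {}
    for check in checks:
        reason = xfail.get(check.__name__)
        if reason is None:
            check(estimator)  # ordinary check: failures propagate
            continue
        try:
            check(estimator)
        except AssertionError:
            warnings.warn(reason, SkipTestWarning)  # expected failure
        else:
            warnings.warn(f"{check.__name__} passed, it can be removed "
                          "from the _xfail_checks tag", UserWarning)
```

Running `run_checks(ToyEstimator(), [check_complex_data])` emits a SkipTestWarning with the tag's reason instead of raising; if the check were fixed, the same call would instead warn that the tag entry is stale.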