ENH Adds support to xfail in check_estimator. #16963


Closed

Conversation

@thomasjpfan (Member) commented Apr 19, 2020

Reference Issues/PRs

Fixes #16958

What does this implement/fix? Explain your changes.

Allows check_estimator to warn according to the _xfail_checks tag.

  1. When a check is in _xfail_checks and an assertion is raised, a warning is issued.
  2. When a check is in _xfail_checks and NO assertion is raised, a warning is issued stating that the check can now be removed from _xfail_checks.
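
A minimal sketch of that behavior, assuming a wrapper in the spirit of the PR's _check_warns_on_fail (the signature and messages here are guesses, not the PR's actual code):

import warnings
from sklearn.exceptions import SkipTestWarning

def _check_warns_on_fail(check, check_name, reason):
    def wrapped(*args, **kwargs):
        try:
            check(*args, **kwargs)
        except AssertionError:
            # case 1: the check failed as expected, warn with the tag's reason
            warnings.warn(reason, SkipTestWarning)
        else:
            # case 2: the check unexpectedly passed, suggest removing the tag entry
            warnings.warn(f"{check_name} passed, it can be removed "
                          "from the _xfail_checks tag", SkipTestWarning)
    return wrapped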

@rth (Member) commented Apr 19, 2020

I'm just concerned about the complexity of that test logic. IMO if the test is marked as xfail it should be skipped in check_estimator (possibly with a warning) and that's it.

If users want nuanced xfail behavior they should use a proper testing framework, i.e. pytest with the parametrize_with_checks decorator (and we should encourage that). We shouldn't have to re-implement the detailed xfail logic ourselves. When @NicolasHug tried to change that code he already found it somewhat complex, and I feel this would add more complexity.

Sorry I should have commented on the issue earlier.
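
For reference, the pytest route mentioned above looks roughly like this (whether parametrize_with_checks accepts classes or instances depends on the scikit-learn version):

from sklearn.linear_model import LogisticRegression
from sklearn.utils.estimator_checks import parametrize_with_checks

@parametrize_with_checks([LogisticRegression()])
def test_sklearn_compatible_estimator(estimator, check):
    # pytest applies xfail marks for checks listed in the _xfail_checks tag
    check(estimator)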

@rth (Member) commented Apr 19, 2020

Since there is indeed a bug, I would just propose to bypass the test, possibly with a warning that it was skipped. That's what pytest.xfail does by default.

We are using more advanced pytest features to run the tests and check the output status as XPASS/XFAIL, but we shouldn't need to re-implement that for check_estimator IMO.

Or anything else that allows fixing the linked issue without adding too much complexity/code.
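
The skip-only alternative could be as small as this (a sketch; run_checks and its arguments are hypothetical, not the PR's code):

import warnings
from sklearn.exceptions import SkipTestWarning

def run_checks(estimator, checks, xfail_checks):
    # bypass any check listed in the _xfail_checks tag, warning that it was skipped
    for check_name, check in checks:
        if check_name in xfail_checks:
            warnings.warn(f"Skipping {check_name}: {xfail_checks[check_name]}",
                          SkipTestWarning)
            continue
        check(estimator)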

@NicolasHug (Member) commented:

+1 to keep the logic simple here and just not run the checks marked as xfail.

@thomasjpfan (Member, Author) commented:

Updated PR with (hopefully) simpler logic.

Comment on lines 605 to 606
# When a _xfail_checks check passes, raise an error stating that the test
# passes and can be removed from the tag

Member:

My understanding was that we don't care about that scenario because we don't want to reimplement pytest's logic?

Member Author:

I changed it to issuing a warning.

From an implementation point of view, it only takes 2 lines to implement this logic in _check_warns_on_fail. I would not want to keep silent about this, since a developer would not know that an xfail check is now passing.

Comment on lines 345 to 346
if not hasattr(estimator, "_get_tags"):
    return {}

Member:

why is this needed? estimators inheriting from BaseEstimator should have a _get_tags method

Member Author:

The following

assert_raises_regex(TypeError, msg, check_estimator, object)

would fail in the wrong place.

Member:

I think we should just change the test then, which is what I did for the one just below. The previous error message wasn't that great anyway.

"""Get xfail_check from estimator"""
if isinstance(estimator, type):
# try to construct estimator instance, if it is unable to then
# return then ignore xfail tag

Member:

Suggested change
# return then ignore xfail tag
# ignore xfail tag

@@ -331,6 +332,44 @@ def _construct_instance(Estimator):
return estimator


def _get_xfail_checks(estimator):
"""Get xfail_check from estimator"""

Member:

Suggested change
"""Get xfail_check from estimator"""
"""Get checks marked with xfail_checks tag from estimator"""

Comment on lines 401 to 402
"""Mark (estimator, check) pairs with xfail according to the
_xfail_checks_ tag"""

Member:

This needs an update now

return estimator._get_tags()['_xfail_checks'] or {}


def _check_warns_on_fail(estimator, check, xfail_checks_tag):

Member:

Do we really need to pass the estimator? It looks like the logic only affects the check.

Member Author:

It was not needed.

To address https://github.com/scikit-learn/scikit-learn/pull/16963/files#r414661838 it would be needed now.

return estimator._get_tags()['_xfail_checks'] or {}


def _check_warns_on_fail(estimator, check, xfail_checks_tag):

Member:

call this _make_check_warn_on_fail?

Comment on lines 368 to 369
warnings.warn(f"{check_name} passed, it can be removed "
              f"from the _xfail_check tag", SkipTestWarning)

Member:

should we also print the estimator name?

# did not fail
check_name = _set_check_estimator_ids(check)
warnings.warn(f"{check_name} passed, it can be removed "
              f"from the _xfail_check tag", SkipTestWarning)

Member:

Should this be a FutureWarning?



def _check_warns_on_fail(estimator, check, xfail_checks_tag):
"""Convert assertion errors to warnings if reason is given"""

Member:

We can probably provide more details here:

Suggested change
"""Convert assertion errors to warnings if reason is given"""
"""
Wrap the check so that a warning is raised:
- when the check is in the `xfail_checks` tag and the check properly failed as expected
- when the check is in the `xfail_checks` tag and the check didn't fail but should have
Checks that aren't in the xfail_checks tag aren't wrapped and are returned as-is.
This wrapper basically simulates what pytest would do with the @xfail decorator, but this one can be used with check_estimator() which doesn't rely on pytest.
"""

return estimator, check

reason = xfail_checks_tag[check_name]
name = _get_estimator_name(estimator)

Member:

Suggested change
name = _get_estimator_name(estimator)
est_name = _get_estimator_name(estimator)

@@ -333,42 +334,100 @@ def _construct_instance(Estimator):
return estimator


def _get_estimator_name(estimater):

Member:

Suggested change
def _get_estimator_name(estimater):
def _get_estimator_name(estimator):

Comment on lines 349 to 350
# try to construct estimator instance, if it is unable to then
# return xfail tag

Member:

Suggested change
# try to construct estimator instance, if it is unable to then
# return xfail tag
# try to construct estimator instance, if it is unable to then
# xfail_checks tag is ignored

# skips check_estimators_fit_returns_self based on _xfail_checks

assert_warns_message(SkipTestWarning, "This is a bad check",
                     check_estimator, LRXDoesNotRaise)

Member:

pass an instance?

assert_warns_message(FutureWarning,
                     "LRXFailTags:check_complex_data passed, it can "
                     "be removed from the _xfail_check tag",
                     check_estimator, LRXFailTags)

Member:

pass an instance?

Comment on lines 615 to 623
def fit(self, X, y):
    # do not raise error for complex check
    try:
        return super().fit(X, y)
    except ValueError as e:
        if "Complex data not supported" not in str(e):
            raise

    return self

Member:

Sorry I don't understand what this class and the check below are supposed to test

Also what's the relation with check_estimators_fit_returns_self?

Member Author:

The comment needed to be updated to:

# skips check_complex_data based on _xfail_checks

@@ -592,6 +594,42 @@ def __init__(self, special_parameter):
check_estimator, MyEstimator)


class LRXFailTags(LogisticRegression):

Member:

Should this be called LRXFailTagButCheckPasses?
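
For context, such a test estimator declares the tag via the _more_tags convention; roughly (class name and reason string taken from the surrounding diff, the body itself is a sketch):

from sklearn.linear_model import LogisticRegression

class LRXFailTags(LogisticRegression):
    def _more_tags(self):
        # map check name -> reason; check_estimator should treat a failure
        # of check_complex_data as expected
        return {"_xfail_checks": {"check_complex_data": "This is a bad check"}}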


for estimator, check in checks_generator:
    check_name = _set_check_estimator_ids(check)
    if check_name in xfail_checks_tag:

Member:

Any reason to remove the previous comments? These can't hurt IMO

return self


def test_check_estimator_xfail_tag_skips():

Member:

My understanding is that check_estimator does not skip anything?

@thomasjpfan (Member, Author) commented Apr 26, 2020:

This checks that it raises a "SkipTestWarning". Renamed to test_check_estimator_xfail_tag_raises_skip_test_warning.



def test_check_estimator_xfail_tag_skips():
# skips check_complex_data based on _xfail_checks

Member:

Can't we just check on an estimator with a properly set xfail_tags instead, like the Dummy one?

I don't understand why we need to create a new estimator here

@thomasjpfan (Member, Author) commented Apr 26, 2020:

I was thinking that if the estimator gets fixed and the xfail_tags get removed, this test would start failing.

Changed to use Dummy.

@NicolasHug (Member) commented:

I think I'd be OK with the proposed changes, but I still agree with @rth that it would be simpler to just ignore the xfail_checks in check_estimator.

try:
    check(*args, **kwargs)
except AssertionError:
    warnings.warn(reason, SkipTestWarning)

Member:

I find it a bit strange to raise a SkipTestWarning when the test wasn't actually skipped: the check was run, but we got a failure as expected.

Member Author:

I do not think we have a good warning for "expected to fail".

Member Author:

Maybe a fairly ambiguous "UserWarning".

@rth (Member) left a comment:

Thanks! A couple of comments. Where I agree is that it's definitely not the most straightforward part of the codebase. Maybe some type annotations wouldn't have hurt here, just for readability, particularly for e.g. the Estimator input that's actually either a class or an instance. Not asking for this PR, but it could have been nice.
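
A hypothetical annotated signature along those lines (purely illustrative, not part of the PR):

from typing import Dict, Type, Union
from sklearn.base import BaseEstimator

def _get_xfail_checks(
    estimator: Union[Type[BaseEstimator], BaseEstimator]
) -> Dict[str, str]:
    # accepts either an estimator class or an instance; returns the mapping
    # of check name -> xfail reason (empty if the tag is unset)
    ...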

# got a class
return estimator.__name__
# got an instance
return type(estimator).__name__

Member:

Could you re-use this function in _set_check_estimator_ids where there is currently redundant code?

Member Author:

_set_check_estimator_ids gets the ID from an estimator using str.

I removed this function to reduce the size of the diff, since it is not really needed anymore.

def wrapped(*args, **kwargs):
    try:
        check(*args, **kwargs)
    except AssertionError:

Member:

xfailed tests could fail with any exception, not just AssertionError
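
i.e. a broader catch, roughly (a sketch of the suggested fix; _make_check_warn_on_fail is the name proposed earlier in the thread):

import warnings
from sklearn.exceptions import SkipTestWarning

def _make_check_warn_on_fail(check, reason):
    def wrapped(*args, **kwargs):
        try:
            check(*args, **kwargs)
        except Exception:
            # an xfail'd check may fail with e.g. ValueError or TypeError,
            # not only AssertionError
            warnings.warn(reason, SkipTestWarning)
    return wrapped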

@NicolasHug (Member) commented:

This will need an update considering that classes aren't supported anymore.

That being said, I proposed an alternative in #17219

Successfully merging this pull request may close these issues.

xfail_checks tag only works with parametrize_with_checks