ENH Adds support to xfail in check_estimator. #16963


Closed

Conversation

@thomasjpfan (Member) commented Apr 19, 2020

Reference Issues/PRs

Fixes #16958

What does this implement/fix? Explain your changes.

Allows check_estimator to warn according to the _xfail_checks tag.

  1. When a check is in _xfail_checks and an assertion is raised, a warning is issued.
  2. When a check is in _xfail_checks and NO assertion is raised, a warning is issued stating that the check can now be removed from _xfail_checks.
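
A minimal sketch of that behavior, assuming a wrapper in the spirit of the PR's _check_warns_on_fail (the signature and messages here are guesses, not the PR's actual code):

import warnings
from sklearn.exceptions import SkipTestWarning

def _check_warns_on_fail(check, check_name, reason):
    def wrapped(*args, **kwargs):
        try:
            check(*args, **kwargs)
        except AssertionError:
            # case 1: the check failed as expected, warn with the tag's reason
            warnings.warn(reason, SkipTestWarning)
        else:
            # case 2: the check unexpectedly passed, suggest removing the tag entry
            warnings.warn(f"{check_name} passed, it can be removed "
                          "from the _xfail_checks tag", SkipTestWarning)
    return wrapped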

@rth (Member) commented Apr 19, 2020

I'm just concerned about the complexity of that test logic. IMO if the test is marked as xfail it should be skipped in check_estimator (possibly with a warning) and that's it.

If users want nuanced xfail behavior they should use a proper testing framework, i.e. pytest with the parametrize_with_checks decorator (and we should encourage that). We shouldn't have to re-implement the detailed xfail logic ourselves. When @NicolasHug tried to change that code he already found it somewhat complex, and I feel this would add more complexity.

Sorry I should have commented on the issue earlier.
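
For reference, the pytest route mentioned above looks roughly like this (whether parametrize_with_checks accepts classes or instances depends on the scikit-learn version):

from sklearn.linear_model import LogisticRegression
from sklearn.utils.estimator_checks import parametrize_with_checks

@parametrize_with_checks([LogisticRegression()])
def test_sklearn_compatible_estimator(estimator, check):
    # pytest applies xfail marks for checks listed in the _xfail_checks tag
    check(estimator)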

@rth (Member) commented Apr 19, 2020

Since there is indeed a bug, I would just propose to bypass the test, possibly with a warning that it was skipped. That's what pytest.xfail does by default.

We are using more advanced pytest features to run the tests and check the output status as XPASS/XFAIL, but we shouldn't need to re-implement that for check_estimator IMO.

Or anything else that allows fixing the linked issue without adding too much complexity/code.
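
The skip-only alternative could be as small as this (a sketch; run_checks and its arguments are hypothetical, not the PR's code):

import warnings
from sklearn.exceptions import SkipTestWarning

def run_checks(estimator, checks, xfail_checks):
    # bypass any check listed in the _xfail_checks tag, warning that it was skipped
    for check_name, check in checks:
        if check_name in xfail_checks:
            warnings.warn(f"Skipping {check_name}: {xfail_checks[check_name]}",
                          SkipTestWarning)
            continue
        check(estimator)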

@NicolasHug (Member) commented:

+1 to keep the logic simple here and just not run the checks marked as xfail.

@thomasjpfan (Member, Author) commented:

Updated PR with (hopefully) simpler logic.

Comment on lines 605 to 606
# When a _xfail_checks check passes, raise an error stating that the test
# passes and can be removed from the tag

Member:

My understanding was that we don't care about that scenario because we don't want to reimplement pytest's logic?

Member Author:

I changed it to issuing a warning.

From an implementation point of view, it only takes 2 lines to implement this logic in _check_warns_on_fail. I would not want to keep silent about this, since a developer would not know that an xfail check is now passing.

Comment on lines 345 to 346
if not hasattr(estimator, "_get_tags"):
    return {}

Member:

why is this needed? estimators inheriting from BaseEstimator should have a _get_tags method

Member Author:

The following

assert_raises_regex(TypeError, msg, check_estimator, object)

would fail in the wrong place.

Member:

I think we should just change the test then, which is what I did for the one just below. The previous error message wasn't that great anyway.

"""Get xfail_check from estimator"""
if isinstance(estimator, type):
# try to construct estimator instance, if it is unable to then
# return then ignore xfail tag

Member:

Suggested change
# return then ignore xfail tag
# ignore xfail tag

@@ -331,6 +332,44 @@ def _construct_instance(Estimator):
return estimator


def _get_xfail_checks(estimator):
"""Get xfail_check from estimator"""

Member:

Suggested change
"""Get xfail_check from estimator"""
"""Get checks marked with xfail_checks tag from estimator"""

Comment on lines 401 to 402
"""Mark (estimator, check) pairs with xfail according to the
_xfail_checks_ tag"""

Member:

This needs an update now

return estimator._get_tags()['_xfail_checks'] or {}


def _check_warns_on_fail(estimator, check, xfail_checks_tag):

Member:

Do we really need to pass the estimator? It looks like the logic only affects the check.

Member Author:

It was not needed.

To address https://github.com/scikit-learn/scikit-learn/pull/16963/files#r414661838 it would be needed now.

return estimator._get_tags()['_xfail_checks'] or {}


def _check_warns_on_fail(estimator, check, xfail_checks_tag):

Member:

call this _make_check_warn_on_fail?

Comment on lines 368 to 369
warnings.warn(f"{check_name} passed, it can be removed "
              f"from the _xfail_check tag", SkipTestWarning)

Member:

should we also print the estimator name?

# did not fail
check_name = _set_check_estimator_ids(check)
warnings.warn(f"{check_name} passed, it can be removed "
              f"from the _xfail_check tag", SkipTestWarning)

Member:

Should this be a FutureWarning?



def _check_warns_on_fail(estimator, check, xfail_checks_tag):
"""Convert assertion errors to warnings if reason is given"""

Member:

We can probably provide more details here:

Suggested change
"""Convert assertion errors to warnings if reason is given"""
"""
Wrap the check so that a warning is raised:
- when the check is in the `xfail_checks` tag and the check properly failed as expected
- when the check is in the `xfail_checks` tag and the check didn't fail but should have
Checks that aren't in the xfail_checks tag aren't wrapped and are returned as-is.
This wrapper basically simulates what pytest would do with the @xfail decorator, but this one can be used with check_estimator() which doesn't rely on pytest.
"""

return estimator, check

reason = xfail_checks_tag[check_name]
name = _get_estimator_name(estimator)

Member:

Suggested change
name = _get_estimator_name(estimator)
est_name = _get_estimator_name(estimator)

@@ -333,42 +334,100 @@ def _construct_instance(Estimator):
return estimator


def _get_estimator_name(estimater):

Member:

Suggested change
def _get_estimator_name(estimater):
def _get_estimator_name(estimator):

Comment on lines 349 to 350
# try to construct estimator instance, if it is unable to then
# return xfail tag

Member:

Suggested change
# try to construct estimator instance, if it is unable to then
# return xfail tag
# try to construct estimator instance, if it is unable to then
# xfail_checks tag is ignored

# skips check_estimators_fit_returns_self based on _xfail_checks

assert_warns_message(SkipTestWarning, "This is a bad check",
                     check_estimator, LRXDoesNotRaise)

Member:

pass an instance?

assert_warns_message(FutureWarning,
                     "LRXFailTags:check_complex_data passed, it can "
                     "be removed from the _xfail_check tag",
                     check_estimator, LRXFailTags)

Member:

pass an instance?

Comment on lines 615 to 623
def fit(self, X, y):
    # do not raise error for complex check
    try:
        return super().fit(X, y)
    except ValueError as e:
        if "Complex data not supported" not in str(e):
            raise

    return self

Member:

Sorry I don't understand what this class and the check below are supposed to test

Also what's the relation with check_estimators_fit_returns_self?

Member Author:

The comment needed to be updated to:

# skips check_complex_data based on _xfail_checks

@@ -592,6 +594,42 @@ def __init__(self, special_parameter):
check_estimator, MyEstimator)


class LRXFailTags(LogisticRegression):

Member:

Should this be called LRXFailTagButCheckPasses?
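
For context, such a test estimator declares the tag via the _more_tags convention; roughly (class name and reason string taken from the surrounding diff, the body itself is a sketch):

from sklearn.linear_model import LogisticRegression

class LRXFailTags(LogisticRegression):
    def _more_tags(self):
        # map check name -> reason; check_estimator should treat a failure
        # of check_complex_data as expected
        return {"_xfail_checks": {"check_complex_data": "This is a bad check"}}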


for estimator, check in checks_generator:
    check_name = _set_check_estimator_ids(check)
    if check_name in xfail_checks_tag:

Member:

Any reason to remove the previous comments? These can't hurt IMO

return self


def test_check_estimator_xfail_tag_skips():

Member:

My understanding is that check_estimator does not skip anything?

@thomasjpfan (Member, Author) commented Apr 26, 2020:

This checks that it raises a "SkipTestWarning". Renamed to test_check_estimator_xfail_tag_raises_skip_test_warning.



def test_check_estimator_xfail_tag_skips():
# skips check_complex_data based on _xfail_checks

Member:

Can't we just check on an estimator with a properly set xfail_tags instead, like the Dummy one?

I don't understand why we need to create a new estimator here

@thomasjpfan (Member, Author) commented Apr 26, 2020:

I was thinking that if the estimator gets fixed and the xfail_tags get removed, this test would start failing.

Changed to use Dummy.

@NicolasHug (Member) commented:

I think I'd be OK with the proposed changes, but I still agree with @rth that it would be simpler to just ignore the xfail_checks in check_estimator.

try:
    check(*args, **kwargs)
except AssertionError:
    warnings.warn(reason, SkipTestWarning)

Member:

I find it a bit strange to raise a SkipTestWarning when the test wasn't actually skipped: the check was run, but we got a failure as expected.

Member Author:

I do not think we have a good warning for "expected to fail".

Member Author:

Maybe a fairly ambiguous "UserWarning".

@rth (Member) left a comment:

Thanks! A couple of comments. Where I agree is that it's definitely not the most straightforward part of the codebase. Maybe some type annotations wouldn't have hurt here, just for readability, particularly for e.g. the Estimator input that's actually either a class or an instance. Not asking for this PR, but it could have been nice.
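
A hypothetical annotated signature along those lines (purely illustrative, not part of the PR):

from typing import Dict, Type, Union
from sklearn.base import BaseEstimator

def _get_xfail_checks(
    estimator: Union[Type[BaseEstimator], BaseEstimator]
) -> Dict[str, str]:
    # accepts either an estimator class or an instance; returns the mapping
    # of check name -> xfail reason (empty if the tag is unset)
    ...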

# got a class
return estimator.__name__
# got an instance
return type(estimator).__name__

Member:

Could you re-use this function in _set_check_estimator_ids where there is currently redundant code?

Member Author:

_set_check_estimator_ids gets the ID from an estimator using str.

I removed this function to reduce the size of the diff, since it is not really needed anymore.

def wrapped(*args, **kwargs):
    try:
        check(*args, **kwargs)
    except AssertionError:

Member:

xfailed tests could fail with any exception, not just AssertionError
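
i.e. a broader catch, roughly (a sketch of the suggested fix; _make_check_warn_on_fail is the name proposed earlier in the thread):

import warnings
from sklearn.exceptions import SkipTestWarning

def _make_check_warn_on_fail(check, reason):
    def wrapped(*args, **kwargs):
        try:
            check(*args, **kwargs)
        except Exception:
            # an xfail'd check may fail with e.g. ValueError or TypeError,
            # not only AssertionError
            warnings.warn(reason, SkipTestWarning)
    return wrapped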

@NicolasHug (Member) commented:

This will need an update considering that classes aren't supported anymore.

That being said, I proposed an alternative in #17219

Successfully merging this pull request may close these issues.

xfail_checks tag only works with parametrize_with_checks