-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
[WIP] Prototype 2 for strict check_estimator mode #16890
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Prototype 2 for strict check_estimator mode #16890
Conversation
Souunds like this doesn't play well with tests marked with a I honestly don't understand how that works, help welcome @thomasjpfan |
sklearn/utils/estimator_checks.py
Outdated
if strict_mode: | ||
return check(*args, **kwargs) | ||
else: | ||
raise SkipTest(f"Skipping {check} since strict mode is off") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the issue may be that this decorator shouldn't raise skip test (which might not be handled well in a pytest session), but rather only mark the test as skipped in some way. And we can skip it in the same place we currently mark tests as xfail.
I'll try to have a more detailed look in a couple of days..
The issue comes from So the fix is to change the tag value, for every single estimator. Or is there a way we can change |
The following is one way help with the keyword issue: --- a/sklearn/utils/estimator_checks.py
+++ b/sklearn/utils/estimator_checks.py
@@ -290,7 +290,7 @@ def _yield_all_checks(name, estimator):
yield check_fit_non_negative
-def _set_check_estimator_ids(obj):
+def _set_check_estimator_ids(obj, include_keywords=True):
"""Create pytest ids for checks.
When `obj` is an estimator, this returns the pprint version of the
@@ -306,6 +306,9 @@ def _set_check_estimator_ids(obj):
obj : estimator or function
Items generated by `check_estimator`
+ include_keywords : bool, default=True
+ Includes keywords in id
+
Returns
-------
id : string or None
@@ -318,7 +321,7 @@ def _set_check_estimator_ids(obj):
if not isinstance(obj, partial):
return obj.__name__
- if not obj.keywords:
+ if not obj.keywords or not include_keywords:
return obj.func.__name__
kwstring = ",".join(["{}={}".format(k, v)
@@ -378,7 +381,7 @@ def _mark_xfail_checks(estimator, check, pytest):
if not xfail_checks:
return estimator, check
- check_name = _set_check_estimator_ids(check)
+ check_name = _set_check_estimator_ids(check, include_keywords=False)
msg = xfail_checks.get(check_name, None)
if msg is None: |
…rict_mode_function_level
Thanks for the suggestion @thomasjpfan . This seems to have fixed the previous issue, but I'm concerned that this patch will prevent the selection of xfail tests based on a parameter (especially the def _more_tags(self):
return {
'_xfail_test': {
'some_check(strict_mode=true)':
'the strict case fails but the non-strict one should be just fine'
}
} |
Ah yes, then we would need to update |
And @rth @thomasjpfan, would it be better for the xfail tag logic to rely on partial functions instead of string representations? I.e. we would specify partial application of a check in the tag, instead of its test id. |
Thanks for working on this @NicolasHug . Sorry for slow review.
Could you give an example? Do you mean, e.g. '_xfail_test': {
partial(check_methods_subset_invariance, some_param=3),
'fails for the predict method'
} ? >>> def f(x, a=2):
... return x + 1
...
>>> from functools import partial
>>> partial(f, a=3)
functools.partial(<function f at 0x7f2366b2bdc0>, a=3)
>>> hash(partial(f, a=3))
8736876476520 I still find it a bit convoluted to use it that, particularly if check_.. can one day become decorated by pytest with some marks (and then I'm not sure if there is case where it's not hashable). But as a tuple (func, message) why not, though we would need to iterate though all of the items manually then. maybe comparison would be more reliable indeed, and it would ensure that the skipped check does indeed exist. So overall I have mixed feeling about it, possibly with +0.5. Not sure if it helps :) |
Just read the previous conversation in more details. If strict check is unpleasant to make work with xfail_checks as strings, then +1 to for partial functions. |
BTW as @adrinjalali noted we should aim for a feature freeze soon. Considering that i) the solution for the strict mode will likely break backward compat of the xfail logic and that ii) we won't reach a solution within the timeframe, we should probably make very clear in the UG that the |
…rict_mode_function_level
You think we won't? Disabling stict checks is likely the no 1 ask from maintainers of contrib projects. If you need help on this I'd be happy to help. Do we have a date for the feature freeze? |
Thanks for the help! I'll try to make it work with partial functions and will report back. If it's too tricky I guess relying on the generated check id would be fine. Feature freeze is planned for the end of next week apparently. |
Yes, saw the message on gitter. Will respond there.
I know, but at least we'll have the basic mechanism. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this much better than using an estimator tag to indicate strict mode?
You mean an additional tag on top of With #16958, I'm starting to believe we need to rethink the |
I suppose on top of xfail_checks, yes. strict is there to delineate design
standards expected in scikit-learn from minimal expectations for any
estimator. xfail_checks is intended for estimators that are necessarily
unconventional in some way.
|
Could you please help me understand with an example of how the tag would be used? Would it be something like def _more_tags(self):
return {
'dont_execute_in_strict_mode': [check_1, check_2, ...]
} ? In any case, note that the changes proposed so far in this PR are orthogonal to the addition of a new tag. Also, question for you: do you think we should allow checks to be partially strict? I.e. allow a check to have a strict section, and a non-strict section? |
If that's the case for a test, should they not be two tests? This also fixes #13969, doesn't it? Putting this in 0.24 (although I'd be happy to have this in a minor release as well). |
I think we should not release |
I personally tend to agree with you. For context, the reason I'm asking is because if we don't need to have partially strict checks, then #16882 is enough. @rth seems to think it's worth having #16882 (comment). But I'd like to hear @jnothman's opinion too. |
I personally would be much happier with this solution, especially having #16756 in mind. |
Sorry, what do you mean by "this" solution? |
I agree we should avoid over-engineering this. We can apply strict at the check level, but this means e.g. that we cant assert error message in common tests. Basically all
Just a reminder that "The estimator tags are experimental and the API is subject to change.". Currently |
Just to be clear, I'm not opposed to that solution, as long as we break checks in smaller pieces when needed. Technically they might be semi-public API though (or at least have been around enough time for contrib projects to build testing setups around them e.g. https://github.com/rtavenar/tslearn/blob/f87744a4d62578f4c82aead6f3ee4e9d51fd2a54/tslearn/tests/test_estimators.py#L40) |
Sorry, I meant as opposed to #16882 |
I just meant:
Then all assertions which are for a standard we hold in sklearn but not required for general compatibility would skip if 'strict' is not. |
In what case? |
in the case where we know an estimator doesn't pass the strict suite, but we still want to know which of the strict checks fail. |
Also note that changing the tag of an instance is basically unsupported right now. We only support setting tags on classes, as far as I can tell. Having strict_mode as a tag would thus prevent us from using the strict suite with |
Okay. I'm alright with strict not being a tag
Untrue.
|
I meant setting tags on the fly on an instance i.e. |
Closes #14929
Fixes #13969
Fixes #16241
Alternative to #16882 and #17252
Here we require every single check to have a
strict_mode
parameter, even if it's not usedping @rth :)