[WIP] Verbose flag displaying progress bar for check_estimator in sklearn.utils.estimator_checks #13843
Conversation
Created 'verbose' parameter in check_estimator function from sklearn.utils.estimator_checks with default value False.
Co-Authored-By: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Please pursue step 2; step 1 is unnecessary, and we've correspondingly closed #13748.
Thanks @jnothman - do you have any pointers on how I could get started on step 2? Or any pytest resources that could point me in the right direction to generate the tests for the collector?
Sorry guys - is there any resource you could point me to for creating tests for the pytest collector?
I think this might be a good entry point. Here's how to use hooks: I think you want to work with the collection hooks?
Thanks @amueller ! I will look into it |
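For reference, pytest's standard entry point for collection-time parametrization is the `pytest_generate_tests` hook in a `conftest.py`. A minimal sketch of that hook follows; `DummyEstimator` and `yield_checks` are illustrative stand-ins, not scikit-learn API:

```python
# Sketch of a conftest.py using pytest's pytest_generate_tests hook.
# DummyEstimator and yield_checks are invented for illustration and are
# not scikit-learn API.

class DummyEstimator:
    def fit(self, X=None, y=None):
        self.fitted_ = True
        return self

def yield_checks(estimator):
    # Stand-in for a generator of (name, check-callable) pairs.
    yield ("check_fit", lambda est: est.fit())
    yield ("check_repr", lambda est: repr(est))

def pytest_generate_tests(metafunc):
    # Called by pytest for every test function during collection;
    # parametrize any test that requests `estimator` and `check`.
    if {"estimator", "check"} <= set(metafunc.fixturenames):
        params = [(est, check)
                  for est in [DummyEstimator()]
                  for _name, check in yield_checks(est)]
        metafunc.parametrize("estimator, check", params)
```

With this hook in place, a bare `def test_estimators(estimator, check): check(estimator)` would be expanded into one test per (estimator, check) pair at collection time.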
Pytest runner added inside the check_estimator function; still need to generate tests for the collector. Changes to be committed: modified: estimator_checks.py
Merge remote-tracking branch 'upstream/master' into check_estimator_progress_verbose_mode.
Merge branch 'check_estimator_progress_verbose' of github.com:scouvreur/scikit-learn into check_estimator_progress_verbose_mode.
I wouldn't think we want to make use of that. I think we already have a usable solution internal to scikit-learn here: scikit-learn/sklearn/tests/test_common.py, lines 84 to 113 at 197f448, but at the moment it relies on internal helpers. We could provide something like:

```python
@pytest.mark.estimator_checks(_tested_estimators())
def test_estimators(estimator, check):
    # Common tests for estimator instances
    with ignore_warnings(category=(DeprecationWarning, ConvergenceWarning,
                                   UserWarning, FutureWarning)):
        set_checking_parameters(estimator)
        name = estimator.__class__.__name__
        check(name, estimator)
```

or something even more succinct. This assumes that it's fine to construct the estimator and identify which checks it qualifies for at collection time (we currently do this in scikit-learn). This may not be efficient. To avoid constructing the estimator at collection time, you'd need to generate all the checks and skip those combinations that are inapplicable at test (not collection) time; here one option would be to create a suitable collector. I'm happy with the former approach: a public alternative to our current internal use.
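The test-time-skip alternative mentioned above could look roughly like this. This is a hedged sketch, not sklearn code: `SkipCheck` stands in for `pytest.skip`, and the check names and `Dense` class are invented for illustration.

```python
# Illustrative sketch of generating every check up front and deciding
# applicability at call time rather than at collection time.
# SkipCheck stands in for pytest.skip; all names here are invented.

class SkipCheck(Exception):
    """Raised by a check that does not apply to the given estimator."""

def check_has_fit(estimator):
    assert hasattr(estimator, "fit")

def check_supports_sparse(estimator):
    if not getattr(estimator, "supports_sparse", False):
        raise SkipCheck("estimator does not handle sparse input")
    # real assertions on sparse input would go here

ALL_CHECKS = [check_has_fit, check_supports_sparse]

class Dense:
    def fit(self, X=None, y=None):
        return self

def run_checks(estimator):
    # In a real pytest run each pair would be one parametrized test;
    # here we just record pass/skip outcomes.
    results = {}
    for check in ALL_CHECKS:
        try:
            check(estimator)
            results[check.__name__] = "passed"
        except SkipCheck:
            results[check.__name__] = "skipped"
    return results
```

The trade-off is exactly the one noted above: no estimator is constructed at collection time, but every inapplicable combination still appears as a skipped test.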
Thanks for the feedback @jnothman - I will work on that and update you with progress!
Changed _yield_all_checks to a public method and changed references to its private version.
@jnothman I guess I wanted it to be a function; your solution would require the users to write a test, right? Which might be more sensible but not a direct replacement.
check_estimator needs to be called in a test for library developers already, so I don't see what's changed. We could add magic like that, but I don't really see the benefit.
If you really want to follow the approach I suggested above (not sure if it's the right thing to do), you will need to implement something much like https://github.com/pytest-dev/pytest/blob/2c402f4bd9cff2c6faeccb86a97364e1fa122a16/src/_pytest/python.py#L119-L128
but actually that is probably a nuisance since scikit-learn would have to provide a pytest plugin, which seems a bit silly? Maybe better off just having users do:
The limitation of this is that checks are estimator dependent, and having lots of skipped tests that will never run is also not ideal. The solution proposed in #13843 (comment) sounds reasonable. It's fairly close to the one proposed in #11622 (comment):

```python
ALL_ESTIMATORS = [Estimator1, Estimator2, etc]

@pytest.mark.parametrize(
    'check, name, estimator',
    itertools.chain.from_iterable(
        check_estimator(Estimator, evaluate=False)
        for Estimator in ALL_ESTIMATORS
    )
)
```

(maybe with an additional helper function), where `check_estimator(Estimator, evaluate=False)` would yield the checks instead of running them.
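A self-contained sketch of what such a generate-only `check_estimator` could look like follows. This is illustrative only, not the actual sklearn implementation; the individual checks and the `Est` class are invented names.

```python
import itertools

# Hypothetical generate-only check_estimator: with evaluate=False it
# yields (check, name, estimator) tuples for pytest.mark.parametrize
# instead of running the checks. All names are illustrative.

def _yield_checks(estimator):
    def check_instantiable(name, est):
        assert type(est)() is not est  # a fresh instance can be made
    def check_fit_returns_self(name, est):
        assert est.fit() is est
    yield check_instantiable
    yield check_fit_returns_self

def check_estimator(Estimator, evaluate=True):
    estimator = Estimator()
    name = Estimator.__name__
    if not evaluate:
        # generate-only mode: hand the parametrization to pytest
        return [(check, name, estimator)
                for check in _yield_checks(estimator)]
    # default mode: run every check immediately
    for check in _yield_checks(estimator):
        check(name, estimator)

class Est:
    def fit(self, X=None, y=None):
        return self

# the parametrization the decorator above would receive
params = list(itertools.chain.from_iterable(
    check_estimator(E, evaluate=False) for E in [Est]
))
```

Each tuple then becomes one pytest test case, so failures are reported per check rather than aborting the whole run at the first failing check.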
Right. Thanks for the reminder that we are trying not to overwhelm it with skips.
Fair, let's go with something like that. Seems pretty straightforward, right?
I think we should have `generate_only=True` rather than `evaluate=False`, but yes, it's a good solution as far as I'm concerned.
…rogress_verbose_mode
…rogress_verbose_mode
Added bool flag for check_estimator() and updated pytest decorator for test_estimators(). Changes to be committed: modified: tests/test_common.py, utils/estimator_checks.py
@scouvreur I think we're going with #14381 right now. Not sure if there has been some miscommunication anywhere? cc @thomasjpfan
Yes, it seems we are leaning toward #14381. Thank you for working on this issue @scouvreur!
Ah I see @amueller - should I close this PR then?
It's been closed. Thanks a lot for helping us nut this out.
Reference Issues/PRs
Enhancement proposal suggested by @cod3licious in issue #13748. Also related to issue #11622.
What does this implement/fix? Explain your changes.
As suggested by @jnothman:
Create a `verbose` parameter in the `check_estimator` function with default value `False`.
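For context on what the PR title's verbose progress display could look like, here is a minimal sketch. It is illustrative only, not the PR's actual implementation; `run_with_progress`, `check_fit`, and `Dummy` are invented names.

```python
import sys

# Illustrative sketch of a verbose mode: print a simple progress line
# as each check runs. Not the PR's actual implementation.

def check_fit(estimator):
    assert hasattr(estimator, "fit")

class Dummy:
    def fit(self, X=None, y=None):
        return self

def run_with_progress(checks, estimator, verbose=False, out=sys.stdout):
    total = len(checks)
    passed = []
    for i, check in enumerate(checks, 1):
        check(estimator)
        passed.append(check.__name__)
        if verbose:
            # \r rewrites the same line, giving a crude progress bar
            out.write("\r[%d/%d] %s" % (i, total, check.__name__))
            out.flush()
    if verbose:
        out.write("\n")
    return passed
```

With `verbose=False` (the proposed default) nothing is printed, preserving the existing silent behaviour of `check_estimator`.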