Allow disassembled use of check_estimator #11622


Closed
azrdev opened this issue Jul 18, 2018 · 13 comments · Fixed by #14381

@azrdev

azrdev commented Jul 18, 2018

Description

For my downstream project, I'm testing my classifier with check_estimator, and would like to see which of its individual tests failed.
Doing so under nose works by using
for check in _yield_all_checks(name, estimator): yield check, name, my_estimator
(plus a separate call for the class-level checks), but pytest does not support yield-tests, and using _yield_all_checks to parametrize a test doesn't work because the function requires the estimator instance.
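
For reference, a rough sketch of the two approaches described above (MyClassifier stands in for the downstream estimator; _yield_all_checks is a private scikit-learn helper whose name and signature may change between releases):

from sklearn.utils.estimator_checks import _yield_all_checks

from mypackage import MyClassifier  # placeholder for the downstream estimator

name = MyClassifier.__name__
my_estimator = MyClassifier()

# nose-style yield test: each yielded (check, name, estimator) tuple
# becomes its own test case, so individual failures are visible
def test_all_checks():
    for check in _yield_all_checks(name, my_estimator):
        yield check, name, my_estimator

# pytest has dropped yield-test support, and parametrizing over
# _yield_all_checks is awkward because the helper is private and needs the
# estimator instance up front.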

Therefore the request is to provide all checks from check_estimator as a (non-private) iterable, so they can be used separately.

I brought this up in #10728 first, erroneously.

Versions

Linux-4.17.5-1-ARCH-x86_64-with-arch-Arch-Linux
Python 3.6.6 (default, Jun 27 2018, 13:11:40)
[GCC 8.1.1 20180531]
NumPy 1.14.5
SciPy 1.1.0
Scikit-Learn 0.19.1

@rth
Member

rth commented Jul 18, 2018

We could for instance add a parameter evaluate=True to check_estimator; with evaluate=False it would yield (check, name, estimator) tuples instead of running the checks. Downstream projects would then be able to use it, for instance with pytest as follows,

import pytest

from sklearn.utils.estimator_checks import check_estimator

@pytest.mark.parametrize('check, name, estimator',
                         check_estimator(SomeEstimatorClass, evaluate=False))
def test_sklearn_compatible_estimator(check, name, estimator):
    check(name, estimator)

The limitations of this approach are:

  • the name of the estimator is passed as a parametrization parameter. For a single estimator this is somewhat redundant, but it makes it easier to test multiple estimators in the downstream project at once, e.g. with

itertools.chain.from_iterable(check_estimator(Estimator, evaluate=False)
                              for Estimator in [Estimator1, Estimator2, ...])

  • the repr of the estimator instance is a bit verbose and suboptimal in the list of pytest-generated test names, but this can be fixed by customizing pytest options (see the sketch below).
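
For example, a minimal sketch of shorter test IDs, assuming the evaluate=False API proposed above existed: pass an ids callable to parametrize so the check's function name is used where available and pytest's default repr otherwise.

import pytest
from sklearn.utils.estimator_checks import check_estimator

from mypackage import SomeEstimatorClass  # placeholder for the downstream estimator

def _pretty_id(val):
    # returning None tells pytest to fall back to its default ID for this value
    return getattr(val, '__name__', None)

@pytest.mark.parametrize('check, name, estimator',
                         check_estimator(SomeEstimatorClass, evaluate=False),  # proposed API
                         ids=_pretty_id)
def test_sklearn_compatible_estimator(check, name, estimator):
    check(name, estimator)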

Will make a PR to illustrate this better.

@massich
Contributor

massich commented Jul 18, 2018

I'll give it a try

@jnothman
Member

jnothman commented Jul 19, 2018 via email

@rth
Member

rth commented Jul 20, 2018

Another way to potentially handle this is to provide, as an alternative to check_estimator, a pytest fixture that generates the various checks, or something similar... I've not thought this out in detail.

Interesting idea. The problem is that checks are generated based on the estimator, so the fixture would need to know which third-party estimators to test, which might be difficult.

@azrdev
Author

azrdev commented Jul 21, 2018

The problem is that checks are generated based on the estimator

What about separating this decision (does a check apply to this estimator?) from enumerating all checks? The simplest way would be to move it into the check itself, but maybe there's an even better idea?
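
To make the suggestion concrete, a hypothetical sketch (not how the checks are currently structured) where the applicability decision lives inside the check itself:

import pytest
from sklearn.base import is_classifier

def check_classification_example(name, estimator):
    # hypothetical check: it decides by itself whether it applies,
    # instead of relying on the code that enumerates the checks
    if not is_classifier(estimator):
        pytest.skip('%s is not a classifier' % name)
    # ... the actual classification assertions would follow here ...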

@rth
Member

rth commented Jul 24, 2018

What about separating this decision (does a check apply to this estimator?) from enumerating all checks? The simplest way would be to move it into the check itself, but maybe there's an even better idea?

Generally, some of this was discussed in #6715, but it would require significant refactoring of the way the estimator checks work. Estimator tags (#8022) could also be part of the solution in the long term.

We can't just move that decision inside the corresponding check_* functions (and skip when a check does not apply to the estimator), because then we would get an overly verbose list of tests that is harder to follow. E.g. if check_classification_* is currently skipped for some estimator, that typically means there is a known issue and the skip is a workaround. If the estimator is not a classifier in the first place, the check is simply not run. With the proposed change it would be run (and skipped) for all estimators, which would be confusing IMO.

@amueller
Member

I thought I had commented on this. I'm not sure I'm convinced it's good to enable running individual tests because that will encourage not running all tests.

If your motivation is debugging, then really we should provide a better error message.

@rth
Member

rth commented Jul 24, 2018

I thought I had commented on this. I'm not sure I'm convinced it's good to enable running individual tests because that will encourage not running all tests.

Thanks for the feedback. Still, suppose one has a single check that doesn't pass in check_estimator in a scikit-learn-contrib project. With an iterative setup, one could skip the check in question, mark it as a TODO, and run the rest of the checks. Without it, in the current situation, one would just not use check_estimator at all because it would fail. I'm not sure why the latter situation would be better.

It would give developers in scikit-learn contrib projects more flexibility when needed, without waiting for our 6-12 month release cycle. Flexibility is one of the things mentioned as a limiting factor in #6715. Of course, in an ideal world check_estimator would take into account all the needs of contrib projects, but we are not there yet.

Also, honestly, it's annoying to have one test that runs several dozen checks. That's why we parametrize them in scikit-learn (before with yields, now with pytest). Giving contrib projects the possibility to do the same, and trusting developers not to misuse it, would be nice IMO.
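
A sketch of the iterative setup mentioned above, again assuming the proposed evaluate=False mode: the one known-failing check is filtered out by name and left as a TODO, while everything else still runs.

from sklearn.utils.estimator_checks import check_estimator

from mypackage import MyEstimator  # placeholder for the downstream estimator

# TODO: fix this check and remove it from the list (example check name)
KNOWN_FAILURES = {'check_methods_subset_invariance'}

def test_sklearn_checks_except_known_failures():
    for check, name, estimator in check_estimator(MyEstimator, evaluate=False):
        if getattr(check, '__name__', '') in KNOWN_FAILURES:
            continue
        check(name, estimator)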

@rth
Member

rth commented Jul 24, 2018

Also, some projects will never manage to be fully compliant with the scikit-learn API due to other constraints (e.g. hybrid recommendation systems), but at the same time their API is somewhat inspired by scikit-learn; in that case one could imagine testing only the relevant subset of checks from check_estimator.

@amueller
Member

@rth that could be done with data formats and possibly tasks / estimator types.
But you make a good point that more flexibility is probably better and we're not going to cover 100% of use cases with our first version of estimator tags.

@azrdev
Author

azrdev commented Apr 17, 2019

@rth commented on 24 Jul 2018

We can't just move that decision inside the corresponding check_* functions (and skip when a check does not apply to the estimator), because then we would get an overly verbose list of tests that is harder to follow. E.g. if check_classification_* is currently skipped for some estimator, that typically means there is a known issue and the skip is a workaround.

I think pytest.mark.xfail(run=False) is there for exactly such "workaround" cases, to be distinguished from a skip.
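
In a parametrized setup that could look like this sketch (still assuming the proposed evaluate=False mode): known-broken checks get wrapped in pytest.param with an xfail(run=False) marker, so they are reported separately from genuine skips.

import pytest
from sklearn.utils.estimator_checks import check_estimator

from mypackage import MyEstimator  # placeholder for the downstream estimator

XFAIL_CHECKS = {'check_fit2d_1sample'}  # example name of a known-failing check

def _mark_known_failures(triples):
    for check, name, estimator in triples:
        if getattr(check, '__name__', '') in XFAIL_CHECKS:
            yield pytest.param(check, name, estimator,
                               marks=pytest.mark.xfail(run=False))
        else:
            yield check, name, estimator

@pytest.mark.parametrize('check, name, estimator',
                         _mark_known_failures(check_estimator(MyEstimator, evaluate=False)))
def test_sklearn_compatible_estimator(check, name, estimator):
    check(name, estimator)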

@amueller
Member

amueller commented Jun 7, 2019

Some possibly useful pointers:

I think this might be a good entry point:
https://docs.pytest.org/en/latest/usage.html#calling-pytest-from-python-code

Here's how to use hooks:
https://docs.pytest.org/en/latest/example/simple.html#incremental-testing-test-steps

I think you want to work with the collection hooks?
https://docs.pytest.org/en/latest/reference.html#collection-hooks
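
A conftest.py sketch of the collection-hook route, using the private _yield_all_checks helper since there is no public iterable yet (the helper is internal and may change): pytest_generate_tests parametrizes every test that requests check, name, and estimator.

# conftest.py of a downstream project (sketch)
from sklearn.utils.estimator_checks import _yield_all_checks

from mypackage import EstimatorA, EstimatorB  # placeholders for downstream estimators

ESTIMATORS = [EstimatorA, EstimatorB]

def pytest_generate_tests(metafunc):
    # parametrize any test function that asks for these three arguments
    if {'check', 'name', 'estimator'} <= set(metafunc.fixturenames):
        params = []
        for Estimator in ESTIMATORS:
            name = Estimator.__name__
            estimator = Estimator()
            for check in _yield_all_checks(name, estimator):
                params.append((check, name, estimator))
        metafunc.parametrize('check, name, estimator', params)

# any test module can then simply define:
#
# def test_sklearn_compatible(check, name, estimator):
#     check(name, estimator)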

@amueller
Member

Never mind, I was overthinking it, let's go with #11622 (comment)
