Skip to content

Add non-strict mode to check_estimator #13969

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
amueller opened this issue May 28, 2019 · 16 comments · Fixed by #17361
Closed

Add non-strict mode to check_estimator #13969

amueller opened this issue May 28, 2019 · 16 comments · Fixed by #17361
Labels
API Developer API Third party developer API related Documentation
Milestone

Comments

@amueller
Copy link
Member

Right now the common tests have two different use-cases: checking internal sklearn estimators, and checking 3rd party estimators via check_estimator.

The common tests are very strict to a degree that's not really reasonable for 3rd party estimators, in particular in checking for particular error messages.
Furthermore, the required error-messages are not documented anywhere (see also #10082).

I think it would be good to add a strict=False argument to check_estimator to not require exact messages, and possibly not even the exact error types. That would make it much easier for people to use check_estimator.

@rth
Copy link
Member

rth commented May 30, 2019

It's true that having a way to differentiate common tests used internally for scikit-learn and those intented to run on 3rd-party packages would be very useful. Related to #6715 . Maybe using something like,

if estimator.__module__.startswith('sklearn.'):
    # run some scikit-learn specific check

in common tests would be better? For the examples you provide they should never apply outside of scikit-learn and this would avoid passing this parameter everywhere..

@amueller
Copy link
Member Author

I would rather give the user the control rather than being magic about it. It would add the parameter in a lot of places.
On the other hand, we would have to add your snippet in a lot of places as well (though not in the signatures).
Maybe someone wants to have error messages consistent with sklearn?
I'm not entirely decided but right now tending towards a parameter.

@jnothman
Copy link
Member

jnothman commented May 31, 2019 via email

@thomasjpfan
Copy link
Member

I think not having the exact error message is good. Now...are we okay with not even checking the types when strict=False?

@amueller
Copy link
Member Author

amueller commented May 31, 2019

Or strict could be 'message', 'type' or False? Not sure if that's necessary though.

@adrinjalali
Copy link
Member

Since this seems to be the place, do we want to have a deprecation cycle for certain API changes in check_estomator? Seems reasonable to me. Certain checks can then move from non-strict to strict with a deprecation cycle (related: #14241 (comment))

@hirenmayani
Copy link

I'm working on this. NYC WiMLDS

@amueller
Copy link
Member Author

@hirenmayani I would recommend you start with something smaller if you haven't worked on sklearn before, you can also pick up something more tricky later.

@hirenmayani
Copy link

I think I can fix it. Will ask you if I need any help.

@amueller
Copy link
Member Author

@hirenmayani ok go for it then.

@rth
Copy link
Member

rth commented Aug 6, 2020

Re-opening since we still need to make another PR to use the mechanism added in #17361 in more common tests. Also add a what's new entry.

@rth rth reopened this Aug 6, 2020
@NicolasHug
Copy link
Member

Not validating error messages is done in #18415.

What else do we want to relax when strict mode is off?

@adriangb
Copy link
Contributor

From slack conversation, it would be nice for strict=False to relax this check:

assert y_pred.dtype == np.dtype('float64'), (

To just:

assert y_pred.dtype.kind == "f"  # any precision floating-point

@adrinjalali
Copy link
Member

Similar to #16241, moving to 1.1

@adrinjalali adrinjalali modified the milestones: 1.0, 1.1 Aug 22, 2021
@jeremiedbb
Copy link
Member

very unlikely to be part of 1.1, moving to 1.2

@adrinjalali
Copy link
Member

I think now that we have #29699 merged and released.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment