Invariance testing for partial_fit #3896

Closed
jnothman opened this issue Nov 28, 2014 · 6 comments
Labels
Hard (hard level of difficulty), module:test-suite (everything related to our tests)

Comments

@jnothman
Member

Much of the common functionality across estimators is tested within sklearn.tests.test_common. As far as I can tell, there are no tests of what partial_fit should do in general, such as returning self. (I'm not sure what else is general to all estimators supporting partial_fit.) Test(s) should be added.
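
A minimal sketch of what such a general check could look like. `RunningMean` and `check_partial_fit_returns_self` are hypothetical names invented for illustration; neither is taken from `sklearn.tests.test_common`, and the toy estimator only stands in for any object exposing `partial_fit`.

```python
class RunningMean:
    """Toy incremental estimator (hypothetical, not part of scikit-learn)."""

    def __init__(self):
        self.n_seen_ = 0
        self.mean_ = 0.0

    def partial_fit(self, X, y=None):
        # Update the running mean one sample at a time.
        for x in X:
            self.n_seen_ += 1
            self.mean_ += (x - self.mean_) / self.n_seen_
        return self  # the convention under test


def check_partial_fit_returns_self(estimator, X, y=None):
    """Fail with AssertionError if partial_fit does not return the estimator."""
    returned = estimator.partial_fit(X, y)
    assert returned is estimator, "partial_fit should return self"


check_partial_fit_returns_self(RunningMean(), [1.0, 2.0, 3.0])
```

Because the check only relies on duck typing, the same function could in principle be looped over every estimator that has a `partial_fit` attribute.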

@jnothman jnothman added Bug Easy Well-defined and straightforward way to resolve and removed Bug labels Nov 28, 2014
@arjoly
Member

arjoly commented Nov 28, 2014

Some possible invariance tests:

  1. Check that an error is raised if the number of features changes from one partial_fit call to the next.
  2. Check that fit followed by partial_fit is equivalent to two successive partial_fit calls.
  3. Check that calling fit after a sequence of fit/partial_fit calls resets the estimator.
  4. Check that classifiers correctly handle the classes argument of partial_fit.
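
Checks 1, 3 and 4 from the list above can be sketched against a toy classifier. Everything here is an assumption made for illustration: `ToyCentroidClassifier` and `raises_value_error` are hypothetical, and the rule that `classes` must be supplied on the first `partial_fit` call is modelled on the usual convention rather than copied from any particular estimator.

```python
class ToyCentroidClassifier:
    """Hypothetical classifier that accumulates per-class feature sums."""

    def _reset(self):
        # Drop all fitted state so that fit() starts from scratch.
        for attr in ("classes_", "n_features_", "sums_", "counts_"):
            if hasattr(self, attr):
                delattr(self, attr)

    def partial_fit(self, X, y, classes=None):
        if not hasattr(self, "classes_"):
            if classes is None:
                raise ValueError("classes must be passed on the first call")
            self.classes_ = sorted(classes)
            self.n_features_ = len(X[0])
            self.sums_ = {c: [0.0] * self.n_features_ for c in self.classes_}
            self.counts_ = {c: 0 for c in self.classes_}
        elif classes is not None and sorted(classes) != self.classes_:
            raise ValueError("classes differ from the first call")
        # Validate the whole batch before touching any state.
        if any(len(row) != self.n_features_ for row in X):
            raise ValueError("number of features changed")
        if any(label not in self.counts_ for label in y):
            raise ValueError("label not in classes")
        for row, label in zip(X, y):
            self.counts_[label] += 1
            self.sums_[label] = [s + v for s, v in zip(self.sums_[label], row)]
        return self

    def fit(self, X, y):
        self._reset()
        return self.partial_fit(X, y, classes=set(y))


def raises_value_error(func, *args, **kwargs):
    """Return True if calling func raises ValueError, False otherwise."""
    try:
        func(*args, **kwargs)
    except ValueError:
        return True
    return False


X2, y2 = [[0.0, 0.0], [1.0, 1.0]], [0, 1]
X3 = [[0.0, 0.0, 0.0]]

# Check 1: a change in the number of features is rejected.
clf = ToyCentroidClassifier().partial_fit(X2, y2, classes=[0, 1])
assert raises_value_error(clf.partial_fit, X3, [0])

# Check 3: fit discards earlier state, so new counts and a new width are fine.
clf.partial_fit(X2, y2)
clf.fit(X2, y2)
assert clf.counts_ == {0: 1, 1: 1}
assert clf.fit(X3, [0]).n_features_ == 3

# Check 4: classes is required up front and must stay consistent afterwards.
assert raises_value_error(ToyCentroidClassifier().partial_fit, X2, y2)
clf = ToyCentroidClassifier().partial_fit(X2[:1], y2[:1], classes=[0, 1])
assert clf.classes_ == [0, 1]
assert raises_value_error(clf.partial_fit, X2, y2, classes=[0, 1, 2])
```

Check 2 is deliberately left out of this sketch, since its semantics are disputed later in the thread.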

@jnothman
Member Author

I didn't see this before commenting on #3907. Please note that there I express doubts about the fit() -> partial_fit() semantics, but I agree that whatever we choose should be consistent across estimators.


@ogrisel
Member

ogrisel commented Dec 30, 2014

> Check that doing fit -> partial fit == partial_fit, partial_fit

This is not true: a call to fit can make several passes over the provided data until it reaches convergence on that specific piece of training data, whereas a call to partial_fit will in general do a small incremental update, assuming that there is more data to come and that there is no use trying to overfit this specific chunk.
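
The difference can be illustrated with a toy one-parameter model. This is a sketch under stated assumptions, not how any scikit-learn estimator is implemented: `ToyMeanSGD`, its `lr`, `max_iter` and `tol` parameters, and the data chunks are all hypothetical. Here `fit` repeats passes until the parameter stops moving, while `partial_fit` performs exactly one pass.

```python
class ToyMeanSGD:
    """Hypothetical 1-D model that estimates a mean by gradient steps."""

    def __init__(self, lr=0.1, max_iter=1000, tol=1e-9):
        self.lr = lr
        self.max_iter = max_iter
        self.tol = tol

    def _one_pass(self, X):
        # One gradient step per sample on the squared error (x - w)^2 / 2.
        for x in X:
            self.w_ += self.lr * (x - self.w_)

    def partial_fit(self, X):
        if not hasattr(self, "w_"):
            self.w_ = 0.0
        self._one_pass(X)  # single incremental pass, no convergence loop
        return self

    def fit(self, X):
        self.w_ = 0.0  # fit always restarts from scratch
        for _ in range(self.max_iter):
            before = self.w_
            self._one_pass(X)
            if abs(self.w_ - before) < self.tol:
                break  # converged on this specific chunk
        return self


chunk_a = [10.0] * 5
chunk_b = [0.0] * 5

fit_then_partial = ToyMeanSGD().fit(chunk_a).partial_fit(chunk_b)
two_partials = ToyMeanSGD().partial_fit(chunk_a).partial_fit(chunk_b)

# fit has fully absorbed chunk_a, so the two sequences end in different states.
assert abs(fit_then_partial.w_ - two_partials.w_) > 1.0
```

With these settings the first sequence starts its final update from a value close to 10, while the second starts from a value still far from it, so the resulting parameters disagree.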

@amueller
Member

@ogrisel extensive discussion here: #3907

@raghavrv
Member

Which is waiting for #4841 to get (completed :P) and merged.

@thomasjpfan thomasjpfan added module:test-suite everything related to our tests Hard Hard level of difficulty and removed Easy Well-defined and straightforward way to resolve labels Feb 27, 2022
@adrinjalali
Member

We have quite a few tests in common tests now for partial_fit. Closing this.

7 participants