[WIP] New assert helpers for model comparison and fit reset checks #4841

raghavrv · 2015-06-10T12:27:55Z

Split from #3907 and #4162

New helpers for model comparison - assert_same_model, assert_not_same_model, assert_fitted_attributes_equal and assert_safe_sparse_allclose.
Check if all estimators reset upon fit

TODO

assert_safe_sparse_allclose - To support sparse/dense matrices (naming inspired by safe_sparse_dot).
assert_same_model / assert_not_same_model / assert_fitted_attributes_equal
Unit tests for the assert helpers.
Check to make sure estimator resets when fit.

Partially fixes : #406

TomDLT · 2015-06-10T13:20:48Z

sklearn/utils/testing.py

-                             "'regressor', 'transformer', 'cluster' or None, got"
-                             " %s." % repr(type_filter))
+                             "'regressor', 'transformer', 'cluster' or None,"
+                             "got %s." % repr(type_filter))


you forgot a space between , and got
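
For context: Python joins adjacent string literals with nothing in between, so the separating space has to live inside one of the pieces. A minimal illustration, where the `type_filter` value is made up for the example:

```python
# Adjacent string literals are concatenated at compile time, with no
# separator, and the % formatting then applies to the joined string.
type_filter = "foo"  # hypothetical invalid value, for illustration only

broken = ("'regressor', 'transformer', 'cluster' or None,"
          "got %s." % repr(type_filter))
fixed = ("'regressor', 'transformer', 'cluster' or None, "
         "got %s." % repr(type_filter))

print(broken)  # ... or None,got 'foo'.
print(fixed)   # ... or None, got 'foo'.
```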

raghavrv · 2015-06-10T13:35:28Z

@TomDLT done :)

jnothman · 2015-06-10T13:46:19Z

I had meant that you might propose this in #4162 where at least it has one motivating example. We'll see how this is received.

jnothman · 2015-06-10T13:47:30Z

sklearn/utils/testing.py

+        # Check if the method(X) returns the same for both models.
+        res1 = getattr(estimator1, method)(X)
+        res2 = getattr(estimator2, method)(X)
+        same_model = (res1.shape == res2.shape) and np.allclose(res1, res2)


This will not handle transforms with sparse output.
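
One possible way to cover sparse output is to densify before comparing. The sketch below assumes the outputs are small enough to hold in memory as dense arrays; `outputs_allclose` is a hypothetical name, not the code in this PR:

```python
import numpy as np
from scipy import sparse


def outputs_allclose(res1, res2, rtol=1e-5, atol=1e-8):
    """Return True if two method outputs match, densifying sparse ones.

    Hypothetical helper for illustration; not the implementation in this PR.
    """
    if sparse.issparse(res1):
        res1 = res1.toarray()
    if sparse.issparse(res2):
        res2 = res2.toarray()
    res1, res2 = np.asarray(res1), np.asarray(res2)
    # Check shapes first so broadcasting cannot hide a mismatch.
    return res1.shape == res2.shape and bool(
        np.allclose(res1, res2, rtol=rtol, atol=atol)
    )


dense = np.array([[0.0, 1.5], [2.0, 0.0]])
print(outputs_allclose(sparse.csr_matrix(dense), dense))        # True
print(outputs_allclose(sparse.csr_matrix(dense), dense + 1.0))  # False
```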

raghavrv · 2015-06-10T14:07:31Z

Thanks for the quick review!

> I had meant that you might propose this in #4162 where at least it has one motivating example. We'll see how this is received.

Oh :P Anyway I'll rebase #4162 upon this! So this can be reviewed separately... :)

~~BTW do you feel we should add a test under estimator_checks to make sure this works for all the estimators?~~ Sorry that is more or less what #4162 does!

jnothman · 2015-06-10T14:12:12Z

Within limits, check_fit_reset will check this for all estimators. The
limits are: estimators with 2d float array input to fit, and only their
behaviour in that case (i.e. not testing sparse input effects).
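
The idea of a fit-reset check, within the limits described above (2d float array input), can be sketched as follows. Both toy estimators and the `resets_on_fit` function are hypothetical stand-ins and do not reproduce check_fit_reset itself:

```python
import numpy as np


class MeanEstimator:
    """Toy estimator that resets properly: fit overwrites mean_."""

    def fit(self, X):
        self.mean_ = np.asarray(X, dtype=float).mean(axis=0)
        return self


class LeakyMeanEstimator:
    """Toy estimator with a bug: state from earlier fits leaks into later ones."""

    def fit(self, X):
        X = np.asarray(X, dtype=float)
        previous = getattr(self, "sum_", 0.0)
        self.sum_ = previous + X.sum(axis=0)
        self.mean_ = self.sum_ / X.shape[0]
        return self


def resets_on_fit(estimator_cls, X1, X2):
    """Fit on X1 then X2, and compare with a fresh estimator fit on X2 only."""
    refit = estimator_cls().fit(X1).fit(X2)
    fresh = estimator_cls().fit(X2)
    return bool(np.allclose(refit.mean_, fresh.mean_))


X1 = np.array([[0.0, 1.0], [2.0, 3.0]])
X2 = np.array([[10.0, 20.0], [30.0, 40.0]])
print(resets_on_fit(MeanEstimator, X1, X2))       # True
print(resets_on_fit(LeakyMeanEstimator, X1, X2))  # False
```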


raghavrv · 2015-06-10T14:14:53Z

> i.e. not testing sparse input effects

Ok! so a separate test for sparse alone will suffice apart from having #4162 and #3907 right?

jnothman · 2015-06-10T14:25:17Z

Might I hedge my bets and say "for now"?


raghavrv · 2015-06-10T14:25:30Z

haha okay :)

vene · 2015-06-14T18:14:10Z

sklearn/utils/testing.py

+                            verbose=True):
+    """Check if two sparse arrays are equal up to the desired tolerance.
+
+    It compares the difference between `actual` and `desired` to


The absolute difference, right?

Do you mean absolute difference as in the absolute difference between -1 and 1 is 0?
(I am pretty sure I misunderstood what you said... could you please expand a bit?)

The test compares the absolute value of the difference to the thing you mention. Not the difference. For more information take a look at the documentation of np.allclose.

That helps! Thanks
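
To make the distinction concrete: np.allclose checks `abs(a - b) <= atol + rtol * abs(b)`, so a large negative difference is not mistaken for a small one. A short illustration using the -1 versus 1 case from the question above (the tolerance values are arbitrary choices for the example):

```python
import numpy as np

actual = np.array([-1.0, 1.0])
desired = np.array([1.0, 1.0])
rtol, atol = 1e-7, 0.0

signed = actual - desired                  # [-2., 0.]
tolerance = atol + rtol * np.abs(desired)  # [1e-7, 1e-7]

# Comparing the signed difference would wrongly accept -2 as "small".
print(np.all(signed <= tolerance))          # True (misleading)
# Comparing the absolute value of the difference rejects it.
print(np.all(np.abs(signed) <= tolerance))  # False
print(np.allclose(actual, desired, rtol=rtol, atol=atol))  # False
```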

vene · 2015-06-14T23:06:55Z

I agree with @jnothman that it's hard to review this PR as it is, because it's not clear how reusable these helpers are across the codebase. Are there any places in the existing tests where these helpers could be used?

Out of context as they are now, it's hard to say whether the API and implementation choices in this PR are the right ones.

vene · 2015-06-14T23:11:26Z

sklearn/utils/tests/test_testing.py

+
+
+def test_qda_same_model():
+    # NRT to make sure the rotations_ attribute is correctly compared


Does this mean non-regression test? In general, could you document tests with docstrings with clear first lines?

For this test in particular, I don't see what it would test that the one above wouldn't, as LinearSVC and KMeans also have fitted attributes of their own. Am I missing anything?

Sorry for the lack of clarity in the comment... @amueller had pointed out in one comment that QDA was not correctly compared by the previous implementation of assert_same_model. That was because rotations_ is a list of numpy arrays, so I thought it would be worthwhile to add an NRT for QDA alone!

Will update the comment to make it more clear!

Good, I think the docstring should explain this.
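
A list-valued attribute such as rotations_ needs an element-by-element comparison. A minimal sketch, assuming the list holds one array per class and that the arrays may differ in shape; `values_allclose` is a hypothetical helper and not the implementation in this PR:

```python
import numpy as np


def values_allclose(val1, val2, rtol=1e-7, atol=0.0):
    """Recursively compare arrays, or lists/tuples of arrays, within a tolerance.

    Hypothetical helper for illustration only.
    """
    if isinstance(val1, (list, tuple)) and isinstance(val2, (list, tuple)):
        # Compare containers item by item instead of stacking them.
        return len(val1) == len(val2) and all(
            values_allclose(v1, v2, rtol, atol) for v1, v2 in zip(val1, val2)
        )
    arr1, arr2 = np.asarray(val1), np.asarray(val2)
    return arr1.shape == arr2.shape and bool(
        np.allclose(arr1, arr2, rtol=rtol, atol=atol)
    )


# Stand-ins for a rotations_-like attribute.
rotations_a = [np.eye(2), np.ones((2, 3))]
rotations_b = [np.eye(2), np.ones((2, 3))]
rotations_c = [np.eye(2), np.zeros((2, 3))]

print(values_allclose(rotations_a, rotations_b))  # True
print(values_allclose(rotations_a, rotations_c))  # False
```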

I would also suggest using stub estimators to more explicitly test these helpers. You could test this with something like

    class Dummy(object):
        pass


    def test_compare_attributes():
        a = Dummy()
        also_a = Dummy()
        not_a = Dummy()

        a.foo_ = [3, 10]
        also_a.foo_ = [3, 10]
        not_a.foo_ = [42]

        # assert a and also_a are the same model and not_a is not

This would make the test more self-descriptive and any failures easier to pinpoint. You should probably use the classes from tests/test_base.py
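
A runnable version of the stub-based test might look like the sketch below. The helpers `fitted_attributes` and `same_fitted_attributes` are hypothetical stand-ins for the assert helpers under review, and treating "ends with an underscore" as the marker for a fitted attribute is an assumption based on the naming convention:

```python
class Dummy:
    """Stub estimator with no behaviour of its own."""


def fitted_attributes(estimator):
    """Collect public attributes whose names end in '_' (assumed convention)."""
    return {
        name: value
        for name, value in vars(estimator).items()
        if name.endswith("_") and not name.startswith("_")
    }


def same_fitted_attributes(est1, est2):
    """Plain == comparison; enough for the list-valued stubs used here."""
    return fitted_attributes(est1) == fitted_attributes(est2)


def test_compare_attributes():
    """Stubs with equal foo_ compare as the same model; others do not."""
    a, also_a, not_a = Dummy(), Dummy(), Dummy()
    a.foo_ = [3, 10]
    also_a.foo_ = [3, 10]
    not_a.foo_ = [42]

    assert same_fitted_attributes(a, also_a)
    assert not same_fitted_attributes(a, not_a)


test_compare_attributes()
```

Array-valued attributes would need a tolerance-based comparison instead of `==`.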

vene · 2015-06-14T23:15:48Z

Sorry, I was pretty chaotic in reviewing this, since I jumped from minor nitpicks to high-level concerns and back. I think it'd be best to discuss the high-level things first, just in case we decide to implement things completely differently. For example as Joel suggested we might want to just densify sparse attributes and leverage assert_allclose.

raghavrv · 2015-11-13T14:23:15Z

sklearn/utils/testing.py

+            assert False, msg
+        assert_allclose(val1, val2, rtol=rtol, atol=atol, err_msg=msg)
+    else:
+        assert False, msg


@vene @jnothman Could you look at this implementation once? (This is still WIP as 30% of the tests (fit reset tests) don't pass... but I'd like to know if I am going in the right direction)


raghavrv · 2016-08-03T12:42:30Z

@jnothman Any suggestions on how to salvage this? Should we proceed on this?

betatim · 2016-08-29T20:48:06Z

I'd be interested in the assert_same_model for #7270. Do you want to keep working on this or should I pick that method (and its tests) out of this PR?

raghavrv · 2016-08-29T20:52:07Z

Please go ahead and pick it up if you wish to :) Thanks for working on that!

jnothman · 2016-08-29T22:08:04Z

I think you'll find getting this to merge is a bit of an uphill battle. In your case, Tim (and perhaps a variant applies in general), I also wonder whether pickle equality is what you want...
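
One reading of "pickle equality" is a byte-for-byte comparison of the pickled objects; that reading is an assumption here. The sketch below uses `types.SimpleNamespace` objects as stand-ins for fitted estimators and shows that such a comparison is strict: a difference at the level of floating-point rounding already makes two models unequal.

```python
import math
import pickle
from types import SimpleNamespace

# SimpleNamespace objects stand in for fitted estimators (an assumption).
model_a = SimpleNamespace(coef_=[0.3, 1.0])
model_b = SimpleNamespace(coef_=[0.3, 1.0])
model_c = SimpleNamespace(coef_=[0.1 + 0.2, 1.0])  # differs only by rounding

# Identical state gives identical bytes for these simple objects.
print(pickle.dumps(model_a) == pickle.dumps(model_b))  # True
# A rounding-level difference is enough to change the bytes.
print(pickle.dumps(model_a) == pickle.dumps(model_c))  # False
# A tolerance-based comparison still treats the two as equal.
print(all(math.isclose(x, y) for x, y in zip(model_a.coef_, model_c.coef_)))  # True
```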

jnothman · 2017-06-04T13:49:00Z

I'd like some discussion of this, perhaps at the sprint. Being able to easily assert that models are equal could simplify many tests or make them more rigorous.

betatim · 2017-06-05T06:30:02Z

(I've not done any work on this, if someone else wants to go ahead please do!)

raghavrv · 2017-06-05T07:03:02Z

I can revive this if someone is interested in reviewing it... @jnothman is interested. @vene would you be available for the 2nd review?

vene · 2017-06-05T07:10:36Z

Sure, thanks for finding a way to ease me in to the sprint :)


jnothman · 2017-06-05T08:12:50Z

I think that this has real value for estimator developers, has the potential to strengthen lots of tests, and will allow us to write tests more clearly without caring what type of estimator we are applying the assertion to. Its main downside is that it may be overkill and slow, that it requires us to define what we mean by equivalent after the fact in a way that may not be universal with respect to existing and future estimators, and that it requires a lot of messy data structure traversal. Overall I think it's the right way to go.


amueller · 2017-06-05T13:18:39Z

Last time I checked, I felt it was not worth the effort, and I haven't thought about it since. Can someone maybe give use-cases?
And defining what equality of estimators is is just really hard.... Do we have a definition? Do all the private attributes need to be the same?
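
The question about private attributes can be made concrete with two stubs that agree on their public fitted attributes but differ in private bookkeeping. Whether they count as "equal" depends entirely on the chosen definition; the `public_fitted` helper and the `_cache` attribute are hypothetical:

```python
class Dummy:
    """Stub standing in for a fitted estimator."""


def public_fitted(est):
    """Keep attributes ending in '_' that do not start with '_' (assumed rule)."""
    return {
        name: value
        for name, value in vars(est).items()
        if name.endswith("_") and not name.startswith("_")
    }


first = Dummy()
first.coef_ = [1.0, 2.0]
first._cache = {"n_calls": 1}   # private bookkeeping, made up for the example

second = Dummy()
second.coef_ = [1.0, 2.0]
second._cache = {"n_calls": 7}

# Equal if only public fitted attributes count...
print(public_fitted(first) == public_fitted(second))  # True
# ...but not equal if every attribute has to match.
print(vars(first) == vars(second))                    # False
```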

haiatn · 2023-07-29T12:24:56Z

I like the thinking behind this, but it seems complicated. I don't know that we're at a point where this is needed yet.

adrinjalali · 2023-07-30T10:13:30Z

At this point, if we're going to define estimator equality we probably need a SLEP, and I haven't seen much of a use case for it. Happy to have a fresh discussion on a SLEP if folks are interested though.

raghavrv force-pushed the new_assert_helpers branch from 3b8e8f7 to e70af31 Compare June 10, 2015 13:15

raghavrv changed the title ~~[WIP] New assert helpers~~ [MRG] New assert helpers Jun 10, 2015

TomDLT reviewed Jun 10, 2015

raghavrv force-pushed the new_assert_helpers branch from e70af31 to 53a7637 Compare June 10, 2015 13:35

jnothman reviewed Jun 10, 2015

raghavrv mentioned this pull request Jun 10, 2015

[WIP] TST Add test to check if estimators reset model when fit is called #4162

Closed

raghavrv force-pushed the new_assert_helpers branch from 53a7637 to 3341afb Compare June 13, 2015 21:08

raghavrv changed the title ~~[MRG] New assert helpers~~ [WIP] New assert helpers Jun 13, 2015

vene reviewed Jun 14, 2015

raghavrv reviewed Nov 13, 2015

raghavrv added 4 commits August 3, 2016 14:41

ENH/TST Add helpers assert_{same_model|fitted_attributes_equal}

ad18ddd

TST Add tests for the new assert helpers

TST Add test to check if estimators reset upon fit

64cefb8

FIX Shift the points instead of taking abs to preserve blobiness

61e98d3

WIP + SCAFFOLD_REMOVE_BEFORE_MERGE

35fdeaa

raghavrv force-pushed the new_assert_helpers branch from 7f8b5a5 to 35fdeaa Compare August 3, 2016 12:41

jnothman mentioned this pull request Aug 29, 2016

[WIP] Test determinism of estimators #7270

Closed

5 tasks

raghavrv mentioned this pull request Dec 7, 2016

[MRG+3] FIX Memory leak in MAE; Use safe_realloc; Acquire GIL only when raising; Propagate all errors to python interpreter level (#7811) #8002

Merged

jnothman added the Sprint label Jun 4, 2017

jnothman added Need Contributor Stalled and removed Sprint labels Aug 30, 2017

lesteve added help wanted and removed Need Contributor labels Oct 18, 2017

jnothman mentioned this pull request Oct 8, 2018

[MRG] Added check for idempotence of fit() #12328

Merged

github-actions bot added the module:utils label Mar 2, 2020

Base automatically changed from master to main January 22, 2021 10:48

adrinjalali closed this Jul 30, 2023
