[MRG+1] MAINT Reuse unaltered classes/functions from model_selection #5568


Closed
wants to merge 2 commits from the reuse_mod_sel branch

Conversation

raghavrv
Member

Also clean up some of the cross_validation and grid_search tests.

NOTE: A lot more lines could be removed from further redundant tests, but I feel leaving them as they are (at the cost of 3-4 extra seconds) will ensure that we haven't regressed on the old modules. To be clear: if we remove a redundant test for, say, cross_val_score and later modify model_selection's version of the function and its test, people using the deprecated module will be surprised by that change, since we simply import it from model_selection... Having the test will make sure such a situation won't happen, by failing the old tests and forcing us to leave a copy of the old implementation in the old module...

@amueller @vene @jnothman @GaelVaroquaux @ogrisel reviews?

from .utils.metaestimators import if_delegate_has_method
from .metrics.scorer import check_scoring
from .exceptions import ChangedBehaviorWarning
from .model_selection.search import ParameterGrid
Member

Shouldn't need to specify .search
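
For illustration, the suggested change would import ParameterGrid from the package itself rather than its .search submodule; this is only a sketch, assuming ParameterGrid is re-exported at the package level, which is what the comment implies:

# Suggested: import from the package, not the .search submodule
from .model_selection import ParameterGrid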

@jnothman
Member

jnothman commented Nov 1, 2015

In general I approve of this cut-down on lines of code, and it also avoids people modifying the deprecated versions. There are test failures though.

@raghavrv
Member Author

raghavrv commented Nov 1, 2015

@jnothman all the tests should pass now!

@raghavrv
Member Author

raghavrv commented Nov 7, 2015

@jnothman :P

@jnothman
Member

jnothman commented Nov 8, 2015

my feelings about reviewing right now

@raghavrv
Member Author

hahaha 😆

@@ -61,6 +62,16 @@
'train_test_split']


class FitFailedWarning(FitFailedWarning_):
Member

This should be _FitFailedWarning according to your tab completion explanation before....

Member Author

No this should be FitFailedWarning so people can import this just like before, but with a deprecation warning of course.

Member

Oh, I meant FitFailedWarning_ -> _FitFailedWarning

Member Author

Ah that yes.
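
To illustrate the pattern being discussed: the deprecated module keeps the old name importable while warning on use. The following is only a sketch of that idea; the stand-in base class, message text, and warning mechanism are assumptions, not the PR's exact code.

import warnings


class _FitFailedWarning(RuntimeWarning):
    # Stand-in for the relocated warning class; in the PR it is imported
    # from its new home under an aliased name (FitFailedWarning_ in the
    # diff, _FitFailedWarning as suggested in the review).
    pass


class FitFailedWarning(_FitFailedWarning):
    # Deprecated alias so `from sklearn.cross_validation import
    # FitFailedWarning` keeps working during the deprecation period.
    def __init__(self, *args, **kwargs):
        warnings.warn("FitFailedWarning has moved to the new module and "
                      "this alias will be removed after the deprecation "
                      "period.", DeprecationWarning)
        super(FitFailedWarning, self).__init__(*args, **kwargs)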

@raghavrv
Member Author

raghavrv commented Feb 8, 2016

@MechCoder Thanks heaps for the review!

@raghavrv
Member Author

Fixed both your comments!

from abc import ABCMeta, abstractmethod

import numpy as np
import scipy.sparse as sp
import sklearn.model_selection as mod_sel
Member

You should use relative imports in modules (and absolute imports in tests), see http://scikit-learn.org/stable/developers/contributing.html#coding-guidelines.

You may as well move this import after the from scipy.misc import comb line. As far as I can see, imports tend to be roughly ordered like this: stdlib, numpy, scipy, sklearn.

Member Author

Ah okay! Thanks...

Member

We also frown on "import as" for non-standard imports: it makes reading the code harder, because people need to scroll up to understand what the name of the module means.
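
Putting both pieces of advice together, the top of the module might be reordered roughly like this. This is only a sketch: the exact placement is an assumption, with the neighbouring imports taken from the diff and the review comments above.

from abc import ABCMeta, abstractmethod  # stdlib first

import numpy as np                       # then numpy
import scipy.sparse as sp                # then scipy
from scipy.misc import comb

from . import model_selection            # relative sklearn import, no alias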

Member Author

Thanks for the advice! (BTW if you can spare a few minutes, would you be able to review this? This has been around for a long time ;( )

@raghavrv raghavrv force-pushed the reuse_mod_sel branch 2 times, most recently from 121612c to 9266ca4 Compare February 17, 2016 14:30
@raghavrv raghavrv changed the title [MRG+2] MAINT Reuse unaltered classes/functions from model_selection [MRG+1] MAINT Reuse unaltered classes/functions from model_selection Feb 18, 2016
@raghavrv
Member Author

I hope you meant to rename it to MRG+1 and not MRG+2, although it would have been awesome ;)

@MechCoder
Member

I thought @lesteve also gave a +1

@@ -25,6 +25,8 @@
with warnings.catch_warnings():
    warnings.simplefilter('ignore')
    from sklearn import cross_validation as cval
    # We should be able to import this from cross_validation
    from sklearn.cross_validation import FitFailedWarning as _FFW
Member

You are not using _FFW anywhere ...

Member Author

This is to test that FitFailedWarning can be imported from our old path (which was public and needs to be supported for at least 2 more versions).

from sklearn.cross_validation import FitFailedWarning as _ would be better in that case then.
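
A minimal sketch of such an import-compatibility check in the test module; the test name, the assertion, and the warning filtering are assumptions rather than the PR's exact code:

import warnings


def test_fit_failed_warning_importable_from_old_path():
    # The old public path must stay importable for at least two more
    # versions; silence the deprecation noise and bind to `_` as suggested.
    with warnings.catch_warnings():
        warnings.simplefilter('ignore')
        from sklearn.cross_validation import FitFailedWarning as _
    assert issubclass(_, Warning)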

@raghavrv
Member Author

@lesteve Apart from these comments, could I consider the review a +1 from your side?

@jnothman
Member

I woke from my nap, it seems. In hindsight I'm not sure it was worth the effort, but there are indeed a lot of lines saved! I'm not able to look through this in detail right now, but the tests seem to pass...

@raghavrv
Member Author

Haha!! :D

And this PR basically does 90% of what we will have to do once the deprecations for cross_validation expire...

One huge advantage is that when patching minor issues, we can do it once for the model_selection alone and not worry about fixing cross_validation.

@lesteve
Member

lesteve commented Feb 25, 2016

@lesteve Apart from these comments, could I consider the review a +1 from your side?

I don't have the necessary rights to change the PR title, so let's say it is an honorary +1 ;-).

One huge advantage is that when patching minor issues, we can do it once for the model_selection alone and not worry about fixing cross_validation.

I wholeheartedly agree on this one.

TST Clean up the cross_validation and grid_search tests.
ENH Use scipy's binomial coefficient function comb for calculation of nCk
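
For context on the second commit, a small example of computing nCk with scipy's binomial coefficient; the numbers are illustrative only, scipy.misc.comb is the import mentioned in the review above, and newer scipy versions expose the same function as scipy.special.comb:

from scipy.misc import comb  # scipy.special.comb in newer scipy versions

# Number of ways to choose k=3 items out of n=5, i.e. 5C3
n_choose_k = comb(5, 3, exact=True)
print(n_choose_k)  # 10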
@raghavrv
Member Author

@lesteve Thanks again for the great catch! Have modified the test to make sure that the old (deprecated) FitFailedWarning is tested properly...

@amueller @ogrisel one final review please?

if not np.all(hit):
return False
return True
return model_selection.cross_val_predict(
Member

Is the signature of this one changed? Or why didn't you just import it? Though maybe this is more clear and allows changes in the new function.

Member Author

I did this because the order of the parameters changed in the new module... Previously it was (X, y, cv, ...); in model_selection it is (X, y, labels, ...)...
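
A sketch of the kind of thin wrapper being described, forwarding by keyword so the old positional order still works against the new signature; the parameter list here is abbreviated and the real wrapper forwards more arguments:

from sklearn import model_selection


def cross_val_predict(estimator, X, y=None, cv=None, n_jobs=1):
    # The old public signature puts cv right after y, while the
    # model_selection version inserts labels before cv, so forward by
    # keyword instead of simply re-importing the function.
    return model_selection.cross_val_predict(
        estimator, X, y=y, cv=cv, n_jobs=n_jobs)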

@amueller
Member

looks good except we can't remove GridSearchCV because we want to make backward-incompatible changes to grid_scores_ for multiple metric support. An interesting question is whether we want to make similar backward-incompatible changes to cross_val_score and learning_curve, too.
I feel returning a dict in all cases would be easier in the multiple metric case.

@raghavrv
Member Author

looks good except we can't remove GridSearchCV because we want to make backward-incompatible changes to grid_scores_ for multiple metric support.

Yes you are correct! The main reason why the old tests are not removed is exactly that! To allow backward incompatible changes to any of model_selection classes and functions...

An interesting question is whether we want to make similar backward-incompatible changes to cross_val_score and learning_curve, too.

I feel we can merge this PR as such and re-add the old classes/functions with their old functionality (and ensure that the old tests pass) as and when we make more backward incompatible changes to the model_selection module.

The reason why I suggest that this PR be merged is that it is tough to anticipate in advance and retain the old code for all the functions/classes which could get a backward incompatible change in the future. I think it would be easier to copy the old code back in the PR (that does the backward incompatible change) itself.

WDYT? @amueller @jnothman @MechCoder

@amueller
Member

So you want to remove code and then add it back in later? Why?
I agree that it's hard to anticipate what we want to change, so I would just leave it in.

@raghavrv
Member Author

So you want to remove code and then add it back in later? Why?
I agree that it's hard to anticipate what we want to change, so I would just leave it in.

And revisit this PR at the end of all the model_selection changes?

@amueller
Member

Aka just before 0.18

@raghavrv
Member Author

Should we revisit this one now? @amueller

@jnothman
Member

I'm now +0 on this idea. I don't think there's anything broken here except for contributors changing deprecated files.

@raghavrv
Member Author

Okay. And there are a lot of changes too. I'm not sure how cleanly we can avoid code duplication... I'm closing this PR. Let me know if this needs to be done...

@raghavrv raghavrv closed this Sep 14, 2016
@raghavrv raghavrv deleted the reuse_mod_sel branch September 14, 2016 14:25