[MRG + 1] move custom error/warning classes into sklearn.exceptions (and move deprecated away from utils.__init__.py) #4826

Merged (5 commits) on Oct 20, 2015

raghavrv
Member

@raghavrv raghavrv commented Jun 5, 2015

Takes over @larsmans' #4309.

(@larsmans apologies for proceeding without your reply... I thought this would be nice to have while fixing #2904, since 2 of the 5 classes were in cross_validation.py and grid_search.py)

BTW I left out the utils.arpack.ArpackError and utils.tests.test_estimator_checks.CorrectNotFittedError.

Please review @amueller @larsmans @ogrisel @agramfort :)

@raghavrv raghavrv force-pushed the exceptions branch 4 times, most recently from 74a6ef6 to b9faf05 Compare June 6, 2015 15:27
@raghavrv
Member Author

raghavrv commented Jun 6, 2015

@larsmans Thanks for the review!! I assumed NonBLASDotWarning was considered part of utils that is not explicitly exposed for public use, and hence skipped deprecating it... Please let me know if anything else needs to be done...

Also I agree that NonBLASDotWarning could just be kept as a private class at utils.extmath itself.... Please let me know if I should do so? (also @amueller @agramfort @ogrisel your views on the same? :) )

@amueller
Member

amueller commented Jun 6, 2015

Why did you rename NonBLASDotWarning?

@raghavrv
Member Author

raghavrv commented Jun 6, 2015

Lars felt NonBLASDotWarning need not be exposed as such and could perhaps be renamed to EfficiencyWarning before exposing, if needed...

Can I keep it as NonBLASDotWarning and not expose it in exceptions.py, since utils.extmath is not public?

@amueller
Member

amueller commented Jun 6, 2015

I didn't see his comment. It's fine then. Should we deprecate? I have no strong opinion and I'd be fine with merging as-is.

@raghavrv
Member Author

raghavrv commented Jun 6, 2015

Thanks for the review! :)

@larsmans One final look at this?

@raghavrv
Member Author

raghavrv commented Jun 8, 2015

@jnothman @agramfort Could I trouble you for a review? :)

@amueller I'll count your earlier comment as a +1 ? More work to be done here ;)

"""


class FitFailedWarning(RuntimeWarning):
Member

When taken out of context of a particular module, these need docstrings to explain when they should be used or expected. It may even be appropriate to add these to doc/modules/classes.rst as part of the public API.
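
As a minimal sketch of what such a docstring could look like once the class lives outside its original module: the wording below is illustrative, not the text that was merged, and `fit_or_warn`, `failing_fit`, and `error_score` are hypothetical names used only for this example.

```python
import warnings


class FitFailedWarning(RuntimeWarning):
    """Warning raised when fitting an estimator fails.

    Illustrative docstring only: it states when a caller should expect
    the warning, so the class still makes sense outside the module that
    first defined it.
    """


def fit_or_warn(fit, error_score=0.0):
    """Call ``fit()``; on ValueError, warn and return ``error_score``.

    Hypothetical helper for this sketch, not part of scikit-learn.
    """
    try:
        return fit()
    except ValueError as exc:
        warnings.warn(
            "Fit failed; the score is set to %r. Details: %r"
            % (error_score, exc),
            FitFailedWarning,
        )
        return error_score


def failing_fit():
    # Stand-in for an estimator whose fit raises on bad input.
    raise ValueError("bad input")


# Record warnings instead of printing them, so they can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    score = fit_or_warn(failing_fit)
```

With a docstring in place, `help(FitFailedWarning)` tells a reader when to expect the warning without needing to know which module emits it.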

@larsmans larsmans changed the title MAINT move custom error/warning classes into sklearn.exceptions [WIP] move custom error/warning classes into sklearn.exceptions Jun 8, 2015
@raghavrv raghavrv force-pushed the exceptions branch 2 times, most recently from d62316b to 66cedb6 Compare June 10, 2015 09:43
@raghavrv
Member Author

Can I skip the doctests for the added examples?

>>> import warnings
>>> warnings.simplefilter('always', ChangedBehaviorWarning)
>>> gs = GridSearchCV(estimator=LinearSVC(random_state=0),
param_grid={'C': [1, 10, 100]}, scoring='f1_micro')
Member

You are missing the ... continuation prompt here.
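
To illustrate the point about the continuation prompt, here is a small sketch that only parses the two variants with the standard library's `doctest.DocTestParser` (nothing is executed, so scikit-learn is not needed). The variable names are made up for this example.

```python
import doctest

# Continuation line carries the "... " prompt, so it belongs to the statement.
corrected = """
>>> gs = GridSearchCV(estimator=LinearSVC(random_state=0),
...                   param_grid={'C': [1, 10, 100]}, scoring='f1_micro')
"""

# Continuation line has no prompt, as in the snippet under review.
broken = """
>>> gs = GridSearchCV(estimator=LinearSVC(random_state=0),
param_grid={'C': [1, 10, 100]}, scoring='f1_micro')
"""

parser = doctest.DocTestParser()
corrected_example = parser.get_examples(corrected)[0]
broken_example = parser.get_examples(broken)[0]

# In the corrected form, both lines are source and no output is expected.
# In the broken form, the second line is parsed as *expected output* of an
# incomplete first line, which is why the doctest cannot pass.
second_line_is_source = "param_grid" in corrected_example.source
second_line_is_output = broken_example.want.startswith("param_grid")
```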

@amueller
Member

The current doctests fail not because a warning is raised but because of errors in the doctests themselves.
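
A rough sketch of why a warning alone does not fail a doctest: doctest compares what the example writes to stdout, while warnings are reported through the `warnings` machinery. The function `noisy_add` is a hypothetical stand-in, not code from this PR.

```python
import doctest
import warnings


def noisy_add(a, b):
    """Add two numbers, emitting a warning along the way."""
    warnings.warn("this code path is slow", UserWarning)
    return a + b


# Build a one-example doctest by hand; its globals expose noisy_add.
test = doctest.DocTestParser().get_doctest(
    ">>> noisy_add(1, 2)\n3\n",
    {"noisy_add": noisy_add},
    "noisy_add_example",
    None,
    0,
)

runner = doctest.DocTestRunner(verbose=False)

# Record warnings so they can be counted afterwards.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    results = runner.run(test)

# results.failed counts failing examples; the warning is recorded in
# `caught` but does not affect the comparison of expected output.
```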

@raghavrv
Member Author

@amueller Thanks for the review! :)

Also, do you feel the Examples are fine? Or should they be written from a dev's perspective (as an example of where (s)he'd use the Error/Warning...)?

@raghavrv raghavrv force-pushed the exceptions branch 5 times, most recently from 9156de2 to e2dd2a1 Compare June 11, 2015 13:23
@raghavrv
Member Author

Please let me know if there is anything else to be done! Also do we need a user guide for the exceptions module?

"""


# TODO Import from model_selection
Member

why this TODO?

Member Author

I'll remove!

ENH NonBLASDotWarning -> EfficiencyWarning; Improve error message
DOC Add exceptions module to modules/classes.rst
MAINT Move ConvergenceWarning, UndefinedMetricWarning et al into exceptions
MAINT Remove ChangedBehaviorWarning from base
DOC/FIX Improve DataConversionWarning's docstring
@raghavrv
Member Author

Done! @GaelVaroquaux and @pletelli could you please review? :)

@GaelVaroquaux
Member

Travis failed!

@raghavrv
Member Author

Hmm Interesting failure... Looks like deprecating them has side effects that propagate back to the sklearn.exceptions module...
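
The thread does not record what the failure actually was. One way a deprecation can leak back to the new module, sketched here purely as an assumption, is a decorator that patches `__init__` on the class object itself: the old name and the new name then refer to the same patched class. All names below (`MovedWarning`, `deprecate_in_place`, `deprecated_subclass`) are hypothetical and not scikit-learn APIs.

```python
import warnings


class MovedWarning(UserWarning):
    """Stand-in for a class now defined at its new location."""


class AnotherMovedWarning(UserWarning):
    """Second stand-in, used to show the in-place pitfall."""


def deprecate_in_place(cls, message):
    """Patch ``cls.__init__`` directly; every importer sees the change."""
    original_init = cls.__init__

    def __init__(self, *args, **kwargs):
        warnings.warn(message, DeprecationWarning, stacklevel=2)
        original_init(self, *args, **kwargs)

    cls.__init__ = __init__
    return cls


def deprecated_subclass(cls, message):
    """Return a new subclass that warns, leaving ``cls`` untouched."""

    def __init__(self, *args, **kwargs):
        warnings.warn(message, DeprecationWarning, stacklevel=2)
        cls.__init__(self, *args, **kwargs)

    return type(cls.__name__, (cls,), {"__init__": __init__})


def count_warnings(factory):
    """Instantiate ``factory`` once and count the warnings emitted."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        factory("message")
    return len(caught)


# Subclass approach: only the old alias warns.
OldAlias = deprecated_subclass(MovedWarning, "import from the new module")
alias_count = count_warnings(OldAlias)
new_home_count = count_warnings(MovedWarning)

# In-place approach: the "alias" is the same object, so the class at its
# new home now warns as well.
SameObject = deprecate_in_place(AnotherMovedWarning, "import from the new module")
leaked_count = count_warnings(AnotherMovedWarning)
```

Returning a subclass keeps `isinstance` and `except` checks working for code that still uses the old name, while the class at its new home stays silent.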

@raghavrv
Member Author

I think the last commit should fix it... It's not an ugly hack, I feel...

@GaelVaroquaux
Member

Still failing.

Are you running the tests on your computer? This might save you some time.

@raghavrv
Member Author

Apologies! I got a bit overconfident and assumed it would pass :/ (This version passes on my machine!)

@raghavrv
Member Author

and on travis too :)

"""


class UndefinedMetricWarning(UserWarning):
Member

Maybe this guy should have a description.

@GaelVaroquaux
Member

👍

Merging. Good job!

GaelVaroquaux added a commit that referenced this pull request Oct 20, 2015
[MRG + 1] move custom error/warning classes into sklearn.exceptions (and move `deprecated` away from `utils.__init__.py`)
@GaelVaroquaux GaelVaroquaux merged commit d4e9d79 into scikit-learn:master Oct 20, 2015
@raghavrv
Member Author

Wait.. Description for undefined metric warning?

@raghavrv
Member Author

Anyway thanks for the review and merge :)

@GaelVaroquaux
Member

GaelVaroquaux commented Oct 20, 2015 via email

@raghavrv raghavrv deleted the exceptions branch October 20, 2015 13:29
@raghavrv
Member Author

Done and merged at #5478

raghavrv added a commit to raghavrv/scikit-learn that referenced this pull request Oct 21, 2015
Squashed commit messages - (For reference)

Major
-----

* ENH p --> n_labels
* FIX *ShuffleSplit: all float/invalid type errors at init and int error at split
* FIX make PredefinedSplit accept test_folds in constructor; Cleanup docstrings
* ENH+TST KFold: make rng to be generated at every split call for reproducibility
* FIX/MAINT KFold: make shuffle a public attr
* FIX Make CVIterableWrapper private.
* FIX reuse len_cv instead of recalculating it
* FIX Prevent adding *SearchCV estimators from the old grid_search module
* re-FIX In all_estimators: the sorting to use only the 1st item (name)
    To avoid collision between the old and the new GridSearch classes.
* FIX test_validate.py: Use 2D X (1D X is being detected as a single sample)
* MAINT validate.py --> validation.py
* MAINT make the submodules private
* MAINT Support old cv/gs/lc until 0.19
* FIX/MAINT n_splits --> get_n_splits
* FIX/TST test_logistic.py/test_ovr_multinomial_iris:
    pass predefined folds as an iterable
* MAINT expose BaseCrossValidator
* Update the model_selection module with changes from master
  - From scikit-learn#5161
  -  - MAINT remove redundant p variable
  -  - Add check for sparse prediction in cross_val_predict
  - From scikit-learn#5201 - DOC improve random_state param doc
  - From scikit-learn#5190 - LabelKFold and test
  - From scikit-learn#4583 - LabelShuffleSplit and tests
  - From scikit-learn#5300 - shuffle the `labels` not the `indxs` in LabelKFold + tests
  - From scikit-learn#5378 - Make the GridSearchCV docs more accurate.
  - From scikit-learn#5458 - Remove shuffle from LabelKFold
  - From scikit-learn#5466(scikit-learn#4270) - Gaussian Process by Jan Metzen
  - From scikit-learn#4826 - Move custom error / warnings into sklearn.exceptions

Minor
-----

* ENH Make the KFold shuffling test stronger
* FIX/DOC Use the higher level model_selection module as ref
* DOC in check_cv "y : array-like, optional"
* DOC a supervised learning problem --> supervised learning problems
* DOC cross-validators --> cross-validation strategies
* DOC Correct Olivier Grisel's name ;)
* MINOR/FIX cv_indices --> kfold
* FIX/DOC Align the 'See also' section of the new KFold, LeaveOneOut
* TST/FIX imports on separate lines
* FIX use __class__ instead of classmethod
* TST/FIX import directly from model_selection
* COSMIT Relocate the random_state documentation
* COSMIT remove pass
* MAINT Remove deprecation warnings from old tests
* FIX correct import at test_split
* FIX/MAINT Move P_sparse, X, y defns to top; rm unused W_sparse, X_sparse
* FIX random state to avoid doctest failure
* TST n_splits and split wrapping of _CVIterableWrapper
* FIX/MAINT Use multilabel indicator matrix directly
* TST/DOC clarify why we conflate classes 0 and 1
* DOC add comment that this was taken from BaseEstimator
* FIX use of labels is not needed in stratified k fold
* Fix cross_validation reference
* Fix the labels param doc
raghavrv added a commit to raghavrv/scikit-learn that referenced this pull request Oct 22, 2015
raghavrv added a commit to raghavrv/scikit-learn that referenced this pull request Oct 23, 2015
amueller pushed a commit that referenced this pull request Oct 23, 2015
--------------------

* ENH Reorganize classes/fn from grid_search into search.py
* ENH Reorganize classes/fn from cross_validation into split.py
* ENH Reorganize cls/fn from cross_validation/learning_curve into validate.py

* MAINT Merge _check_cv into check_cv inside the model_selection module
* MAINT Update all the imports to point to the model_selection module
* FIX use iter_cv to iterate through the new style/old style cv objs
* TST Add tests for the new model_selection members
* ENH Wrap the old-style cv obj/iterables instead of using iter_cv

* ENH Use scipy's binomial coefficient function comb for calculation of nCk
* ENH Few enhancements to the split module
* ENH Improve check_cv input validation and docstring
* MAINT _get_test_folds(X, y, labels) --> _get_test_folds(labels)
* TST if 1d arrays for X introduce any errors
* ENH use 1d X arrays for all tests;
* ENH X_10 --> X (global var)

Minor
-----

* ENH _PartitionIterator --> _BaseCrossValidator;
* ENH CVIterator --> CVIterableWrapper
* TST Import the old SKF locally
* FIX/TST Clean up the split module's tests.
* DOC Improve documentation of the cv parameter
* COSMIT consistently hyphenate cross-validation/cross-validator
* TST Calculate n_samples from X
* COSMIT Use separate lines for each import.
* COSMIT cross_validation_generator --> cross_validator

Commits merged manually
-----------------------

* FIX Document the random_state attribute in RandomSearchCV
* MAINT Use check_cv instead of _check_cv
* ENH refactor OVO decision function, use it in SVC for sklearn-like
  decision_function shape
* FIX avoid memory cost when sampling from large parameter grids

ENH Major to Minor incremental enhancements to the model_selection

Squashed commit messages - (For reference)

Major
-----

* ENH p --> n_labels
* FIX *ShuffleSplit: all float/invalid type errors at init and int error at split
* FIX make PredefinedSplit accept test_folds in constructor; Cleanup docstrings
* ENH+TST KFold: make rng to be generated at every split call for reproducibility
* FIX/MAINT KFold: make shuffle a public attr
* FIX Make CVIterableWrapper private.
* FIX reuse len_cv instead of recalculating it
* FIX Prevent adding *SearchCV estimators from the old grid_search module
* re-FIX In all_estimators: the sorting to use only the 1st item (name)
    To avoid collision between the old and the new GridSearch classes.
* FIX test_validate.py: Use 2D X (1D X is being detected as a single sample)
* MAINT validate.py --> validation.py
* MAINT make the submodules private
* MAINT Support old cv/gs/lc until 0.19
* FIX/MAINT n_splits --> get_n_splits
* FIX/TST test_logistic.py/test_ovr_multinomial_iris:
    pass predefined folds as an iterable
* MAINT expose BaseCrossValidator
* Update the model_selection module with changes from master
  - From #5161
  -  - MAINT remove redundant p variable
  -  - Add check for sparse prediction in cross_val_predict
  - From #5201 - DOC improve random_state param doc
  - From #5190 - LabelKFold and test
  - From #4583 - LabelShuffleSplit and tests
  - From #5300 - shuffle the `labels` not the `indxs` in LabelKFold + tests
  - From #5378 - Make the GridSearchCV docs more accurate.
  - From #5458 - Remove shuffle from LabelKFold
  - From #5466(#4270) - Gaussian Process by Jan Metzen
  - From #4826 - Move custom error / warnings into sklearn.exceptions

Minor
-----

* ENH Make the KFold shuffling test stronger
* FIX/DOC Use the higher level model_selection module as ref
* DOC in check_cv "y : array-like, optional"
* DOC a supervised learning problem --> supervised learning problems
* DOC cross-validators --> cross-validation strategies
* DOC Correct Olivier Grisel's name ;)
* MINOR/FIX cv_indices --> kfold
* FIX/DOC Align the 'See also' section of the new KFold, LeaveOneOut
* TST/FIX imports on separate lines
* FIX use __class__ instead of classmethod
* TST/FIX import directly from model_selection
* COSMIT Relocate the random_state documentation
* COSMIT remove pass
* MAINT Remove deprecation warnings from old tests
* FIX correct import at test_split
* FIX/MAINT Move P_sparse, X, y defns to top; rm unused W_sparse, X_sparse
* FIX random state to avoid doctest failure
* TST n_splits and split wrapping of _CVIterableWrapper
* FIX/MAINT Use multilabel indicator matrix directly
* TST/DOC clarify why we conflate classes 0 and 1
* DOC add comment that this was taken from BaseEstimator
* FIX use of labels is not needed in stratified k fold
* Fix cross_validation reference
* Fix the labels param doc

FIX/DOC/MAINT Addressing the review comments by Arnaud and Andy

COSMIT Sort the members alphabetically
COSMIT len_cv --> n_splits
COSMIT Merge 2 if; FIX Use kwargs
DOC Add my name to the authors :D
DOC make labels parameter consistent
FIX Remove hack for boolean indices; + COSMIT idx --> indices; DOC Add Returns
COSMIT preds --> predictions
DOC Add Returns and neatly arrange X, y, labels
FIX idx(s)/ind(s)--> indice(s)
COSMIT Merge if and else to elif
COSMIT n --> n_samples
COSMIT Use bincount only once
COSMIT cls --> class_i / class_i (ith class indices) -->
perm_indices_class_i

FIX/ENH/TST Addressing the final reviews

COSMIT c --> count
FIX/TST make check_cv raise ValueError for string cv value
TST nested cv (gs inside cross_val_score) works for diff cvs
FIX/ENH Raise ValueError when labels is None for label based cvs;
TST if labels is being passed correctly to the cv and that the
ValueError is being propagated to the cross_val_score/predict and grid
search
FIX pass labels to cross_val_score
FIX use make_classification
DOC Add Returns; COSMIT Remove scaffolding
TST add a test to check the _build_repr helper
REVERT the old GS/RS should also be tested by the common tests.
ENH Add a tuple of all/label based CVS
FIX raise VE even at get_n_splits if labels is None
FIX Fabian's comments
PEP8
7 participants