DOC: Clarify cv parameter description in GridSearchCV #12495

Merged
merged 7 commits into from
Nov 12, 2018
15 changes: 15 additions & 0 deletions doc/modules/cross_validation.rst
@@ -142,6 +142,21 @@ validation iterator instead, for instance::
>>> cross_val_score(clf, iris.data, iris.target, cv=cv) # doctest: +ELLIPSIS
array([0.977..., 0.977..., 1. ..., 0.955..., 1. ])

+ Another option is to use an iterable yielding (train, test) splits as arrays of
+ indices, for example::
+
+ >>> def custom_cv_2folds(X):
+ ...     n = X.shape[0]
+ ...     i = 1
+ ...     while i <= 2:
+ ...         idx = np.arange(n * (i - 1) / 2, n * i / 2, dtype=int)
+ ...         yield idx, idx
+ ...         i += 1
+ ...
+ >>> custom_cv = custom_cv_2folds(iris.data)
+ >>> cross_val_score(clf, iris.data, iris.target, cv=custom_cv)
+ array([1. , 0.973...])
+
.. topic:: Data transformation with held out data

Just as it is important to test a predictor on data held-out from
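
The doctest added in this hunk yields the same index array for both the training and the test side of each split. As a complementary illustration, the sketch below passes a list of disjoint (train, test) index arrays to `cross_val_score`. The estimator settings, the shuffling seed, and the names `first_half`, `second_half`, and `custom_splits` are assumptions made for this example and are not part of the PR.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
clf = SVC(kernel="linear", C=1)

# The iris rows are ordered by class, so shuffle the indices once
# before cutting them into two halves.
rng = np.random.RandomState(0)
indices = rng.permutation(X.shape[0])
first_half, second_half = indices[:75], indices[75:]

# Each entry is a (train, test) pair of integer index arrays.
# A list, unlike a generator, can be reused across several calls.
custom_splits = [
    (first_half, second_half),
    (second_half, first_half),
]

scores = cross_val_score(clf, X, y, cv=custom_splits)
print(scores.shape)  # one score per (train, test) pair
```

A generator such as `custom_cv_2folds` is exhausted after one pass, so a list may be the safer choice when the same splits are needed more than once.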
4 changes: 2 additions & 2 deletions examples/model_selection/plot_learning_curve.py
@@ -53,8 +53,8 @@ def plot_learning_curve(estimator, title, X, y, ylim=None, cv=None,
Possible inputs for cv are:
- None, to use the default 3-fold cross-validation,
- integer, to specify the number of folds.
- - An object to be used as a cross-validation generator.
- - An iterable yielding train/test splits.
+ - :term:`CV splitter`,
+ - An iterable yielding (train, test) splits as arrays of indices.

For integer/None inputs, if ``y`` is binary or multiclass,
:class:`StratifiedKFold` used. If the estimator is not a classifier
4 changes: 2 additions & 2 deletions sklearn/calibration.py
@@ -63,8 +63,8 @@ class CalibratedClassifierCV(BaseEstimator, ClassifierMixin):

- None, to use the default 3-fold cross-validation,
- integer, to specify the number of folds.
- - An object to be used as a cross-validation generator.
- - An iterable yielding train/test splits.
+ - :term:`CV splitter`,
+ - An iterable yielding (train, test) splits as arrays of indices.

For integer/None inputs, if ``y`` is binary or multiclass,
:class:`sklearn.model_selection.StratifiedKFold` is used. If ``y`` is
8 changes: 4 additions & 4 deletions sklearn/covariance/graph_lasso_.py
@@ -491,8 +491,8 @@ class GraphicalLassoCV(GraphicalLasso):

- None, to use the default 3-fold cross-validation,
- integer, to specify the number of folds.
- - An object to be used as a cross-validation generator.
- - An iterable yielding train/test splits.
+ - :term:`CV splitter`,
+ - An iterable yielding (train, test) splits as arrays of indices.

For integer/None inputs :class:`KFold` is used.

@@ -897,8 +897,8 @@ class GraphLassoCV(GraphicalLassoCV):

- None, to use the default 3-fold cross-validation,
- integer, to specify the number of folds.
- - An object to be used as a cross-validation generator.
- - An iterable yielding train/test splits.
+ - :term:`CV splitter`,
+ - An iterable yielding (train, test) splits as arrays of indices.

For integer/None inputs :class:`KFold` is used.

4 changes: 2 additions & 2 deletions sklearn/feature_selection/rfe.py
@@ -356,8 +356,8 @@ class RFECV(RFE, MetaEstimatorMixin):

- None, to use the default 3-fold cross-validation,
- integer, to specify the number of folds.
- - An object to be used as a cross-validation generator.
- - An iterable yielding train/test splits.
+ - :term:`CV splitter`,
+ - An iterable yielding (train, test) splits as arrays of indices.

For integer/None inputs, if ``y`` is binary or multiclass,
:class:`sklearn.model_selection.StratifiedKFold` is used. If the
16 changes: 8 additions & 8 deletions sklearn/linear_model/coordinate_descent.py
@@ -1307,8 +1307,8 @@ class LassoCV(LinearModelCV, RegressorMixin):

- None, to use the default 3-fold cross-validation,
- integer, to specify the number of folds.
- - An object to be used as a cross-validation generator.
- - An iterable yielding train/test splits.
+ - :term:`CV splitter`,
+ - An iterable yielding (train, test) splits as arrays of indices.

For integer/None inputs, :class:`KFold` is used.

@@ -1477,8 +1477,8 @@ class ElasticNetCV(LinearModelCV, RegressorMixin):

- None, to use the default 3-fold cross-validation,
- integer, to specify the number of folds.
- - An object to be used as a cross-validation generator.
- - An iterable yielding train/test splits.
+ - :term:`CV splitter`,
+ - An iterable yielding (train, test) splits as arrays of indices.

For integer/None inputs, :class:`KFold` is used.

@@ -2015,8 +2015,8 @@ class MultiTaskElasticNetCV(LinearModelCV, RegressorMixin):

- None, to use the default 3-fold cross-validation,
- integer, to specify the number of folds.
- - An object to be used as a cross-validation generator.
- - An iterable yielding train/test splits.
+ - :term:`CV splitter`,
+ - An iterable yielding (train, test) splits as arrays of indices.

For integer/None inputs, :class:`KFold` is used.

@@ -2194,8 +2194,8 @@ class MultiTaskLassoCV(LinearModelCV, RegressorMixin):

- None, to use the default 3-fold cross-validation,
- integer, to specify the number of folds.
- - An object to be used as a cross-validation generator.
- - An iterable yielding train/test splits.
+ - :term:`CV splitter`,
+ - An iterable yielding (train, test) splits as arrays of indices.

For integer/None inputs, :class:`KFold` is used.

8 changes: 4 additions & 4 deletions sklearn/linear_model/least_angle.py
@@ -1007,8 +1007,8 @@ class LarsCV(Lars):

- None, to use the default 3-fold cross-validation,
- integer, to specify the number of folds.
- - An object to be used as a cross-validation generator.
- - An iterable yielding train/test splits.
+ - :term:`CV splitter`,
+ - An iterable yielding (train, test) splits as arrays of indices.

For integer/None inputs, :class:`KFold` is used.

@@ -1230,8 +1230,8 @@ class LassoLarsCV(LarsCV):

- None, to use the default 3-fold cross-validation,
- integer, to specify the number of folds.
- - An object to be used as a cross-validation generator.
- - An iterable yielding train/test splits.
+ - :term:`CV splitter`,
+ - An iterable yielding (train, test) splits as arrays of indices.

For integer/None inputs, :class:`KFold` is used.

4 changes: 2 additions & 2 deletions sklearn/linear_model/omp.py
@@ -790,8 +790,8 @@ class OrthogonalMatchingPursuitCV(LinearModel, RegressorMixin):

- None, to use the default 3-fold cross-validation,
- integer, to specify the number of folds.
- - An object to be used as a cross-validation generator.
- - An iterable yielding train/test splits.
+ - :term:`CV splitter`,
+ - An iterable yielding (train, test) splits as arrays of indices.

For integer/None inputs, :class:`KFold` is used.

8 changes: 4 additions & 4 deletions sklearn/linear_model/ridge.py
@@ -1211,8 +1211,8 @@ class RidgeCV(_BaseRidgeCV, RegressorMixin):

- None, to use the efficient Leave-One-Out cross-validation
- integer, to specify the number of folds.
- - An object to be used as a cross-validation generator.
- - An iterable yielding train/test splits.
+ - :term:`CV splitter`,
+ - An iterable yielding (train, test) splits as arrays of indices.

For integer/None inputs, if ``y`` is binary or multiclass,
:class:`sklearn.model_selection.StratifiedKFold` is used, else,
@@ -1323,8 +1323,8 @@ class RidgeClassifierCV(LinearClassifierMixin, _BaseRidgeCV):

- None, to use the efficient Leave-One-Out cross-validation
- integer, to specify the number of folds.
- - An object to be used as a cross-validation generator.
- - An iterable yielding train/test splits.
+ - :term:`CV splitter`,
+ - An iterable yielding (train, test) splits as arrays of indices.

Refer :ref:`User Guide <cross_validation>` for the various
cross-validation strategies that can be used here.
8 changes: 4 additions & 4 deletions sklearn/model_selection/_search.py
@@ -901,8 +901,8 @@ class GridSearchCV(BaseSearchCV):

- None, to use the default 3-fold cross validation,
- integer, to specify the number of folds in a `(Stratified)KFold`,
- - An object to be used as a cross-validation generator.
- - An iterable yielding train, test splits.
+ - :term:`CV splitter`,
+ - An iterable yielding (train, test) splits as arrays of indices.

For integer/None inputs, if the estimator is a classifier and ``y`` is
either binary or multiclass, :class:`StratifiedKFold` is used. In all
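
To make the two reworded `cv` options concrete for `GridSearchCV`, here is a minimal sketch that runs the same search twice: once with a CV splitter object and once with the splitter's output materialized as a list of (train, test) index arrays. The dataset, the parameter grid, and the variable names are assumptions chosen for illustration; they do not come from the PR.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
param_grid = {"C": [0.1, 1, 10]}  # illustrative grid

# Option 1: pass a CV splitter object directly.
splitter = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
search_splitter = GridSearchCV(SVC(kernel="linear"), param_grid, cv=splitter)
search_splitter.fit(X, y)

# Option 2: pass an iterable of (train, test) index arrays.
# With a fixed random_state the splitter reproduces the same folds.
precomputed = list(splitter.split(X, y))
search_iterable = GridSearchCV(SVC(kernel="linear"), param_grid, cv=precomputed)
search_iterable.fit(X, y)

print(search_splitter.best_params_, search_iterable.best_params_)
```

Because both searches see identical folds, their mean test scores should agree; the list form is mainly useful when the folds come from somewhere other than a scikit-learn splitter.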
@@ -1235,8 +1235,8 @@ class RandomizedSearchCV(BaseSearchCV):

- None, to use the default 3-fold cross validation,
- integer, to specify the number of folds in a `(Stratified)KFold`,
- - An object to be used as a cross-validation generator.
- - An iterable yielding train, test splits.
+ - :term:`CV splitter`,
+ - An iterable yielding (train, test) splits as arrays of indices.

For integer/None inputs, if the estimator is a classifier and ``y`` is
either binary or multiclass, :class:`StratifiedKFold` is used. In all
4 changes: 2 additions & 2 deletions sklearn/model_selection/_split.py
@@ -1913,8 +1913,8 @@ def check_cv(cv='warn', y=None, classifier=False):

- None, to use the default 3-fold cross-validation,
- integer, to specify the number of folds.
- - An object to be used as a cross-validation generator.
- - An iterable yielding train/test splits.
+ - :term:`CV splitter`,
+ - An iterable yielding (train, test) splits as arrays of indices.

For integer/None inputs, if classifier is True and ``y`` is either
binary or multiclass, :class:`StratifiedKFold` is used. In all other
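
The dispatch rule in this docstring can be checked directly with `check_cv`. The sketch below is only an illustration under assumed toy targets (`y_binary`, `y_continuous`) and a hand-written split. It passes `cv=3` explicitly so the result does not depend on the default number of folds.

```python
import numpy as np
from sklearn.model_selection import KFold, StratifiedKFold, check_cv

# Toy targets, invented for this example.
y_binary = np.array([0, 1, 0, 1, 0, 1])
y_continuous = np.array([0.1, 0.5, 0.9, 1.3, 1.7, 2.1])

# Integer cv + classifier + binary target: stratified folds.
cv_clf = check_cv(cv=3, y=y_binary, classifier=True)
# Integer cv + non-classifier: plain K-fold.
cv_reg = check_cv(cv=3, y=y_continuous, classifier=False)

print(type(cv_clf).__name__)
print(type(cv_reg).__name__)

# An iterable of (train, test) index arrays is wrapped in an object
# that exposes split() and get_n_splits().
splits = [(np.array([0, 1, 2, 3]), np.array([4, 5]))]
cv_iter = check_cv(cv=splits)
print(cv_iter.get_n_splits())
```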
24 changes: 12 additions & 12 deletions sklearn/model_selection/_validation.py
@@ -82,8 +82,8 @@ def cross_validate(estimator, X, y=None, groups=None, scoring=None, cv='warn',

- None, to use the default 3-fold cross validation,
- integer, to specify the number of folds in a `(Stratified)KFold`,
- - An object to be used as a cross-validation generator.
- - An iterable yielding train, test splits.
+ - :term:`CV splitter`,
+ - An iterable yielding (train, test) splits as arrays of indices.

For integer/None inputs, if the estimator is a classifier and ``y`` is
either binary or multiclass, :class:`StratifiedKFold` is used. In all
@@ -292,8 +292,8 @@ def cross_val_score(estimator, X, y=None, groups=None, scoring=None, cv='warn',

- None, to use the default 3-fold cross validation,
- integer, to specify the number of folds in a `(Stratified)KFold`,
- - An object to be used as a cross-validation generator.
- - An iterable yielding train, test splits.
+ - :term:`CV splitter`,
+ - An iterable yielding (train, test) splits as arrays of indices.

For integer/None inputs, if the estimator is a classifier and ``y`` is
either binary or multiclass, :class:`StratifiedKFold` is used. In all
@@ -666,8 +666,8 @@ def cross_val_predict(estimator, X, y=None, groups=None, cv='warn',

- None, to use the default 3-fold cross validation,
- integer, to specify the number of folds in a `(Stratified)KFold`,
- - An object to be used as a cross-validation generator.
- - An iterable yielding train, test splits.
+ - :term:`CV splitter`,
+ - An iterable yielding (train, test) splits as arrays of indices.

For integer/None inputs, if the estimator is a classifier and ``y`` is
either binary or multiclass, :class:`StratifiedKFold` is used. In all
@@ -957,8 +957,8 @@ def permutation_test_score(estimator, X, y, groups=None, cv='warn',

- None, to use the default 3-fold cross validation,
- integer, to specify the number of folds in a `(Stratified)KFold`,
- - An object to be used as a cross-validation generator.
- - An iterable yielding train, test splits.
+ - :term:`CV splitter`,
+ - An iterable yielding (train, test) splits as arrays of indices.

For integer/None inputs, if the estimator is a classifier and ``y`` is
either binary or multiclass, :class:`StratifiedKFold` is used. In all
@@ -1109,8 +1109,8 @@ def learning_curve(estimator, X, y, groups=None,

- None, to use the default 3-fold cross validation,
- integer, to specify the number of folds in a `(Stratified)KFold`,
- - An object to be used as a cross-validation generator.
- - An iterable yielding train, test splits.
+ - :term:`CV splitter`,
+ - An iterable yielding (train, test) splits as arrays of indices.

For integer/None inputs, if the estimator is a classifier and ``y`` is
either binary or multiclass, :class:`StratifiedKFold` is used. In all
@@ -1360,8 +1360,8 @@ def validation_curve(estimator, X, y, param_name, param_range, groups=None,

- None, to use the default 3-fold cross validation,
- integer, to specify the number of folds in a `(Stratified)KFold`,
- - An object to be used as a cross-validation generator.
- - An iterable yielding train, test splits.
+ - :term:`CV splitter`,
+ - An iterable yielding (train, test) splits as arrays of indices.

For integer/None inputs, if the estimator is a classifier and ``y`` is
either binary or multiclass, :class:`StratifiedKFold` is used. In all
12 changes: 6 additions & 6 deletions sklearn/multioutput.py
@@ -511,9 +511,9 @@ class ClassifierChain(_BaseChain, ClassifierMixin, MetaEstimatorMixin):
If cv is None the true labels are used when fitting. Otherwise
possible inputs for cv are:

- * integer, to specify the number of folds in a (Stratified)KFold,
- * An object to be used as a cross-validation generator.
- * An iterable yielding train, test splits.
+ - integer, to specify the number of folds in a (Stratified)KFold,
+ - :term:`CV splitter`,
+ - An iterable yielding (train, test) splits as arrays of indices.

random_state : int, RandomState instance or None, optional (default=None)
If int, random_state is the seed used by the random number generator;
@@ -667,9 +667,9 @@ class RegressorChain(_BaseChain, RegressorMixin, MetaEstimatorMixin):
If cv is None the true labels are used when fitting. Otherwise
possible inputs for cv are:

- * integer, to specify the number of folds in a (Stratified)KFold,
- * An object to be used as a cross-validation generator.
- * An iterable yielding train, test splits.
+ - integer, to specify the number of folds in a (Stratified)KFold,
+ - :term:`CV splitter`,
+ - An iterable yielding (train, test) splits as arrays of indices.

random_state : int, RandomState instance or None, optional (default=None)
If int, random_state is the seed used by the random number generator;