[MRG+1] DOC Make cv documentation consistent across our codebase #5238

Merged · 1 commit · Sep 13, 2015

19 changes: 15 additions & 4 deletions sklearn/calibration.py
@@ -53,10 +53,21 @@ class CalibratedClassifierCV(BaseEstimator, ClassifierMixin):
with too few calibration samples (<<1000) since it tends to overfit.
Use sigmoids (Platt's calibration) in this case.

cv : integer or cross-validation generator or "prefit", optional
If an integer is passed, it is the number of folds (default 3).
Specific cross-validation objects can be passed, see
sklearn.cross_validation module for the list of possible objects.
cv : integer/cross-validation generator/iterable or "prefit", optional
Determines the cross-validation splitting strategy.
Possible inputs for cv are:
- None, to use the default 3-fold cross-validation,
- integer, to specify the number of folds.
- An object to be used as a cross-validation generator.
- An iterable yielding train/test splits.

For integer/None inputs, if ``y`` is binary or multiclass,
:class:`StratifiedKFold` is used. If ``y`` is neither binary nor
multiclass, :class:`KFold` is used.

Refer to the :ref:`User Guide <cross_validation>` for the various
cross-validation strategies that can be used here.

If "prefit" is passed, it is assumed that base_estimator has been
fitted already and all data is used for calibration.

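Not part of the diff, but a minimal usage sketch of the two cv modes documented above may help; the dataset, the GaussianNB base estimator, and the 0.16/0.17-era API are illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.naive_bayes import GaussianNB
from sklearn.calibration import CalibratedClassifierCV

X, y = make_classification(n_samples=200, random_state=0)

# cv=3 (or cv=None): the base estimator is fitted on 3 stratified folds
# and the calibrator is trained on the corresponding held-out samples.
calibrated = CalibratedClassifierCV(GaussianNB(), method='sigmoid', cv=3)
calibrated.fit(X, y)

# cv="prefit": base_estimator is assumed to be fitted already, and all of
# the data passed to fit() is used only for calibration.
prefit_clf = GaussianNB().fit(X[:100], y[:100])
calibrated_prefit = CalibratedClassifierCV(prefit_clf, cv="prefit")
calibrated_prefit.fit(X[100:], y[100:])
```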
15 changes: 12 additions & 3 deletions sklearn/covariance/graph_lasso_.py
@@ -463,9 +463,18 @@ class GraphLassoCV(GraphLasso):
The number of times the grid is refined. Not used if explicit
values of alphas are passed.

cv : cross-validation generator, optional
see sklearn.cross_validation module. If None is passed, defaults to
a 3-fold strategy
cv : int, cross-validation generator or an iterable, optional
Determines the cross-validation splitting strategy.
Possible inputs for cv are:
- None, to use the default 3-fold cross-validation,
- integer, to specify the number of folds.
- An object to be used as a cross-validation generator.
- An iterable yielding train/test splits.

For integer/None inputs, :class:`KFold` is used.

Refer to the :ref:`User Guide <cross_validation>` for the various
cross-validation strategies that can be used here.

tol: positive float, optional
The tolerance to declare convergence: if the dual gap goes below
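Again outside the diff: because GraphLassoCV has no ``y``, integer/None inputs always map to :class:`KFold`. A hedged sketch, assuming the pre-0.18 ``sklearn.cross_validation`` API in which ``KFold`` is built from the number of samples; the random data is only for illustration and may trigger convergence warnings.

```python
import numpy as np
from sklearn.covariance import GraphLassoCV
from sklearn.cross_validation import KFold

rng = np.random.RandomState(0)
X = rng.randn(60, 5)

# cv=4 is shorthand for KFold(n_samples, n_folds=4); passing the
# generator explicitly should be equivalent.
model_int = GraphLassoCV(cv=4).fit(X)
model_gen = GraphLassoCV(cv=KFold(len(X), n_folds=4)).fit(X)
```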
84 changes: 56 additions & 28 deletions sklearn/cross_validation.py
@@ -1202,15 +1202,20 @@ def cross_val_predict(estimator, X, y=None, cv=None, n_jobs=1,
The target variable to try to predict in the case of
supervised learning.

cv : integer or cross-validation generator, optional, default=3
A cross-validation generator to use. If int, determines the number
of folds in StratifiedKFold if estimator is a classifier and the
target y is binary or multiclass, or the number of folds in KFold
otherwise.
Specific cross-validation objects can be passed, see
sklearn.cross_validation module for the list of possible objects.
This generator must include all elements in the test set exactly once.
Otherwise, a ValueError is raised.
cv : int, cross-validation generator or an iterable, optional
Determines the cross-validation splitting strategy.
Possible inputs for cv are:
- None, to use the default 3-fold cross-validation,
- integer, to specify the number of folds.
- An object to be used as a cross-validation generator.
- An iterable yielding train/test splits.

For integer/None inputs, if the estimator is a classifier and ``y`` is
binary or multiclass, :class:`StratifiedKFold` is used. In all other
cases, :class:`KFold` is used.

Refer to the :ref:`User Guide <cross_validation>` for the various
cross-validation strategies that can be used here.

n_jobs : integer, optional
The number of CPUs to use to do the computation. -1 means
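An illustrative sketch (not in the diff) of the iterable form of cv for ``cross_val_predict``: a list of (train, test) index pairs whose test sets cover every sample exactly once, as the previous docstring required. Dataset and estimator are arbitrary choices.

```python
import numpy as np
from sklearn.cross_validation import cross_val_predict
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_iris

iris = load_iris()
indices = np.random.RandomState(0).permutation(len(iris.target))
half = len(indices) // 2

# A hand-rolled 2-fold split: each half of the data is tested exactly once.
custom_cv = [(indices[:half], indices[half:]),
             (indices[half:], indices[:half])]

pred = cross_val_predict(LogisticRegression(), iris.data, iris.target,
                         cv=custom_cv)
```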
@@ -1371,13 +1376,20 @@ def cross_val_score(estimator, X, y=None, scoring=None, cv=None, n_jobs=1,
a scorer callable object / function with signature
``scorer(estimator, X, y)``.

cv : integer or cross-validation generator, optional, default=3
A cross-validation generator to use. If int, determines the number
of folds in StratifiedKFold if estimator is a classifier and the
target y is binary or multiclass, or the number of folds in KFold
otherwise.
Specific cross-validation objects can be passed, see
sklearn.cross_validation module for the list of possible objects.
cv : int, cross-validation generator or an iterable, optional
Determines the cross-validation splitting strategy.
Possible inputs for cv are:
- None, to use the default 3-fold cross-validation,
- integer, to specify the number of folds.
- An object to be used as a cross-validation generator.
- An iterable yielding train/test splits.

For integer/None inputs, if the estimator is a classifier and ``y`` is
binary or multiclass, :class:`StratifiedKFold` is used. In all other
cases, :class:`KFold` is used.

Refer to the :ref:`User Guide <cross_validation>` for the various
cross-validation strategies that can be used here.

n_jobs : integer, optional
The number of CPUs to use to do the computation. -1 means
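A hedged sketch (not in the diff) contrasting the integer shorthand with an explicit generator for ``cross_val_score``, assuming the pre-0.18 ``StratifiedKFold(y, n_folds)`` constructor; with the default ``shuffle=False`` both should yield the same splits.

```python
from sklearn.cross_validation import cross_val_score, StratifiedKFold
from sklearn.svm import SVC
from sklearn.datasets import load_iris

iris = load_iris()
clf = SVC(kernel='linear')

# Classifier + multiclass y, so cv=5 means a 5-fold StratifiedKFold ...
scores_int = cross_val_score(clf, iris.data, iris.target, cv=5)
# ... which is the same as passing the generator explicitly.
scores_gen = cross_val_score(clf, iris.data, iris.target,
                             cv=StratifiedKFold(iris.target, n_folds=5))
```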
@@ -1628,11 +1640,20 @@ def check_cv(cv, X=None, y=None, classifier=False):

Parameters
----------
cv : int, a cv generator instance, or None
The input specifying which cv generator to use. It can be an
integer, in which case it is the number of folds in a KFold,
None, in which case 3 fold is used, or another object, that
will then be used as a cv generator.
cv : int, cross-validation generator or an iterable, optional
Determines the cross-validation splitting strategy.
Possible inputs for cv are:
- None, to use the default 3-fold cross-validation,
- integer, to specify the number of folds.
- An object to be used as a cross-validation generator.
- An iterable yielding train/test splits.

For integer/None inputs, if the estimator is a classifier and ``y`` is
binary or multiclass, :class:`StratifiedKFold` is used. In all other
cases, :class:`KFold` is used.

Refer to the :ref:`User Guide <cross_validation>` for the various
cross-validation strategies that can be used here.

X : array-like
The data the cross-val object will be applied on.
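A small illustrative sketch of ``check_cv``'s dispatch (not in the diff). Note that, in this version, the non-classifier integer branch sizes the ``KFold`` from ``X``, so ``X`` is supplied as well; treat the exact internals as an assumption.

```python
import numpy as np
from sklearn.cross_validation import check_cv

X = np.random.RandomState(0).randn(6, 2)
y = np.array([0, 0, 0, 1, 1, 1])

cv_clf = check_cv(3, X=X, y=y, classifier=True)       # StratifiedKFold(y, 3)
cv_reg = check_cv(3, X=X, y=None, classifier=False)   # KFold over len(X)
cv_default = check_cv(None, X=X, y=y, classifier=True)  # None -> 3 folds
```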
@@ -1692,13 +1713,20 @@ def permutation_test_score(estimator, X, y, cv=None,
a scorer callable object / function with signature
``scorer(estimator, X, y)``.

cv : integer or cross-validation generator, optional, default=3
A cross-validation generator to use. If int, determines the number
of folds in StratifiedKFold if estimator is a classifier and the
target y is binary or multiclass, or the number of folds in KFold
otherwise.
Specific cross-validation objects can be passed, see
sklearn.cross_validation module for the list of possible objects.
cv : int, cross-validation generator or an iterable, optional
Determines the cross-validation splitting strategy.
Possible inputs for cv are:
- None, to use the default 3-fold cross-validation,
- integer, to specify the number of folds.
- An object to be used as a cross-validation generator.
- An iterable yielding train/test splits.

For integer/None inputs, if the estimator is a classifier and ``y`` is
binary or multiclass, :class:`StratifiedKFold` is used. In all other
cases, :class:`KFold` is used.

Refer to the :ref:`User Guide <cross_validation>` for the various
cross-validation strategies that can be used here.

n_permutations : integer, optional
Number of times to permute ``y``.
19 changes: 14 additions & 5 deletions sklearn/feature_selection/rfe.py
@@ -290,11 +290,20 @@ class RFECV(RFE, MetaEstimatorMixin):
If within (0.0, 1.0), then `step` corresponds to the percentage
(rounded down) of features to remove at each iteration.

cv : int or cross-validation generator, optional (default=None)
If int, it is the number of folds.
If None, 3-fold cross-validation is performed by default.
Specific cross-validation objects can also be passed, see
`sklearn.cross_validation module` for details.
cv : int, cross-validation generator or an iterable, optional
Determines the cross-validation splitting strategy.
Possible inputs for cv are:
- None, to use the default 3-fold cross-validation,
- integer, to specify the number of folds.
- An object to be used as a cross-validation generator.
- An iterable yielding train/test splits.

For integer/None inputs, if the estimator is a classifier and ``y`` is
binary or multiclass, :class:`StratifiedKFold` is used. In all other
cases, :class:`KFold` is used.

Refer to the :ref:`User Guide <cross_validation>` for the various
cross-validation strategies that can be used here.

scoring : string, callable or None, optional, default: None
A string (see model evaluation documentation) or
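Outside the diff, a hedged ``RFECV`` sketch showing an explicit generator in place of the integer shorthand; estimator and data are illustrative, and the pre-0.18 ``StratifiedKFold`` constructor is assumed.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFECV
from sklearn.svm import SVC
from sklearn.cross_validation import StratifiedKFold

X, y = make_classification(n_samples=100, n_features=10, n_informative=3,
                           random_state=0)

# An explicit StratifiedKFold instead of the integer shorthand;
# cv=4 would build the same splits here.
selector = RFECV(SVC(kernel='linear'), step=1,
                 cv=StratifiedKFold(y, n_folds=4))
selector.fit(X, y)
```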
42 changes: 28 additions & 14 deletions sklearn/grid_search.py
@@ -656,13 +656,20 @@ class GridSearchCV(BaseSearchCV):
the folds, and the loss minimized is the total loss per sample,
and not the mean loss across the folds.

cv : integer or cross-validation generator, default=3
A cross-validation generator to use. If int, determines
the number of folds in StratifiedKFold if estimator is a classifier
and the target y is binary or multiclass, or the number
of folds in KFold otherwise.
Specific cross-validation objects can be passed, see
sklearn.cross_validation module for the list of possible objects.
cv : int, cross-validation generator or an iterable, optional
Determines the cross-validation splitting strategy.
Possible inputs for cv are:
- None, to use the default 3-fold cross-validation,
- integer, to specify the number of folds.
- An object to be used as a cross-validation generator.
- An iterable yielding train/test splits.

For integer/None inputs, if the estimator is a classifier and ``y`` is
binary or multiclass, :class:`StratifiedKFold` is used. In all other
cases, :class:`KFold` is used.

Refer to the :ref:`User Guide <cross_validation>` for the various
cross-validation strategies that can be used here.

refit : boolean, default=True
Refit the best estimator with the entire dataset.
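A hedged sketch (not in the diff) of handing ``GridSearchCV`` a non-default generator; ``ShuffleSplit`` with the pre-0.18 constructor, built from the number of samples, is the assumed API.

```python
from sklearn.datasets import load_iris
from sklearn.grid_search import GridSearchCV
from sklearn.svm import SVC
from sklearn.cross_validation import ShuffleSplit

iris = load_iris()

# Any cross-validation generator is accepted; ShuffleSplit replaces the
# default StratifiedKFold behaviour for this classifier.
cv = ShuffleSplit(len(iris.target), n_iter=5, test_size=0.25,
                  random_state=0)
search = GridSearchCV(SVC(), param_grid={'C': [0.1, 1, 10]}, cv=cv)
search.fit(iris.data, iris.target)
```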
@@ -850,13 +857,20 @@ class RandomizedSearchCV(BaseSearchCV):
the folds, and the loss minimized is the total loss per sample,
and not the mean loss across the folds.

cv : integer or cross-validation generator, optional
A cross-validation generator to use. If int, determines
the number of folds in StratifiedKFold if estimator is a classifier
and the target y is binary or multiclass, or the number
of folds in KFold otherwise.
Specific cross-validation objects can be passed, see
sklearn.cross_validation module for the list of possible objects.
cv : int, cross-validation generator or an iterable, optional
Determines the cross-validation splitting strategy.
Possible inputs for cv are:
- None, to use the default 3-fold cross-validation,
- integer, to specify the number of folds.
- An object to be used as a cross-validation generator.
- An iterable yielding train/test splits.

For integer/None inputs, if the estimator is a classifier and ``y`` is
binary or multiclass, :class:`StratifiedKFold` is used. In all other
cases, :class:`KFold` is used.

Refer to the :ref:`User Guide <cross_validation>` for the various
cross-validation strategies that can be used here.

refit : boolean, default=True
Refit the best estimator with the entire dataset.
39 changes: 28 additions & 11 deletions sklearn/learning_curve.py
@@ -59,13 +59,20 @@ def learning_curve(estimator, X, y, train_sizes=np.linspace(0.1, 1.0, 5),
be big enough to contain at least one sample from each class.
(default: np.linspace(0.1, 1.0, 5))

cv : integer or cross-validation generator, optional, default=3
A cross-validation generator to use. If int, determines the number
of folds in StratifiedKFold if estimator is a classifier and the
target y is binary or multiclass, or the number of folds in KFold
otherwise.
Specific cross-validation objects can be passed, see
sklearn.cross_validation module for the list of possible objects.
cv : int, cross-validation generator or an iterable, optional
Determines the cross-validation splitting strategy.
Possible inputs for cv are:
- None, to use the default 3-fold cross-validation,
- integer, to specify the number of folds.
- An object to be used as a cross-validation generator.
- An iterable yielding train/test splits.

For integer/None inputs, if the estimator is a classifier and ``y`` is
binary or multiclass, :class:`StratifiedKFold` is used. In all other
cases, :class:`KFold` is used.

Refer to the :ref:`User Guide <cross_validation>` for the various
cross-validation strategies that can be used here.

scoring : string, callable or None, optional, default: None
A string (see model evaluation documentation) or
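An illustrative ``learning_curve`` sketch (not in the diff); the data is shuffled up front so that even the smallest training subsets contain every class, per the ``train_sizes`` note above. Dataset and estimator are illustrative assumptions.

```python
import numpy as np
from sklearn.learning_curve import learning_curve
from sklearn.naive_bayes import GaussianNB
from sklearn.datasets import load_iris

iris = load_iris()
# Shuffle so that small training subsets still contain every class.
perm = np.random.RandomState(0).permutation(len(iris.target))
X, y = iris.data[perm], iris.target[perm]

# Classifier + multiclass y, so cv=5 is a 5-fold StratifiedKFold.
train_sizes, train_scores, valid_scores = learning_curve(
    GaussianNB(), X, y, train_sizes=np.linspace(0.2, 1.0, 5), cv=5)
```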
@@ -264,10 +271,20 @@ def validation_curve(estimator, X, y, param_name, param_range, cv=None,
param_range : array-like, shape (n_values,)
The values of the parameter that will be evaluated.

cv : integer, cross-validation generator, optional
If an integer is passed, it is the number of folds (defaults to 3).
Specific cross-validation objects can be passed, see
sklearn.cross_validation module for the list of possible objects
cv : int, cross-validation generator or an iterable, optional
Determines the cross-validation splitting strategy.
Possible inputs for cv are:
- None, to use the default 3-fold cross-validation,
- integer, to specify the number of folds.
- An object to be used as a cross-validation generator.
- An iterable yielding train/test splits.

For integer/None inputs, if the estimator is a classifier and ``y`` is
binary or multiclass, :class:`StratifiedKFold` is used. In all other
cases, :class:`KFold` is used.

Refer to the :ref:`User Guide <cross_validation>` for the various
cross-validation strategies that can be used here.

scoring : string, callable or None, optional, default: None
A string (see model evaluation documentation) or
68 changes: 48 additions & 20 deletions sklearn/linear_model/coordinate_descent.py
@@ -1221,11 +1221,18 @@ class LassoCV(LinearModelCV, RegressorMixin):
dual gap for optimality and continues until it is smaller
than ``tol``.

cv : integer or cross-validation generator, optional
If an integer is passed, it is the number of fold (default 3).
Specific cross-validation objects can be passed, see the
:mod:`sklearn.cross_validation` module for the list of possible
objects.
cv : int, cross-validation generator or an iterable, optional
Determines the cross-validation splitting strategy.
Possible inputs for cv are:
- None, to use the default 3-fold cross-validation,
- integer, to specify the number of folds.
- An object to be used as a cross-validation generator.
- An iterable yielding train/test splits.

For integer/None inputs, :class:`KFold` is used.

Refer to the :ref:`User Guide <cross_validation>` for the various
cross-validation strategies that can be used here.

verbose : bool or integer
Amount of verbosity.
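Finally, outside the diff, a hedged sketch for the regression CV estimators, where integer/None always means plain :class:`KFold`; the explicit pre-0.18 generator should give the same folds as ``cv=5``. Data and parameters are illustrative.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoCV
from sklearn.cross_validation import KFold

X, y = make_regression(n_samples=100, n_features=20, noise=1.0,
                       random_state=0)

# Regression target, so the integer shorthand always means plain KFold;
# the explicit generator below should give the same folds.
lasso_int = LassoCV(cv=5).fit(X, y)
lasso_gen = LassoCV(cv=KFold(len(y), n_folds=5)).fit(X, y)
```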
@@ -1360,11 +1367,18 @@ class ElasticNetCV(LinearModelCV, RegressorMixin):
dual gap for optimality and continues until it is smaller
than ``tol``.

cv : integer or cross-validation generator, optional
If an integer is passed, it is the number of fold (default 3).
Specific cross-validation objects can be passed, see the
:mod:`sklearn.cross_validation` module for the list of possible
objects.
cv : int, cross-validation generator or an iterable, optional
Determines the cross-validation splitting strategy.
Possible inputs for cv are:
- None, to use the default 3-fold cross-validation,
- integer, to specify the number of folds.
- An object to be used as a cross-validation generator.
- An iterable yielding train/test splits.

For integer/None inputs, :class:`KFold` is used.

Refer to the :ref:`User Guide <cross_validation>` for the various
cross-validation strategies that can be used here.

verbose : bool or integer
Amount of verbosity.
@@ -1835,11 +1849,18 @@ class MultiTaskElasticNetCV(LinearModelCV, RegressorMixin):
dual gap for optimality and continues until it is smaller
than ``tol``.

cv : integer or cross-validation generator, optional
If an integer is passed, it is the number of fold (default 3).
Specific cross-validation objects can be passed, see the
:mod:`sklearn.cross_validation` module for the list of possible
objects.
cv : int, cross-validation generator or an iterable, optional
Determines the cross-validation splitting strategy.
Possible inputs for cv are:
- None, to use the default 3-fold cross-validation,
- integer, to specify the number of folds.
- An object to be used as a cross-validation generator.
- An iterable yielding train/test splits.

For integer/None inputs, :class:`KFold` is used.

Refer to the :ref:`User Guide <cross_validation>` for the various
cross-validation strategies that can be used here.

verbose : bool or integer
Amount of verbosity.
@@ -1985,11 +2006,18 @@ class MultiTaskLassoCV(LinearModelCV, RegressorMixin):
dual gap for optimality and continues until it is smaller
than ``tol``.

cv : integer or cross-validation generator, optional
If an integer is passed, it is the number of fold (default 3).
Specific cross-validation objects can be passed, see the
:mod:`sklearn.cross_validation` module for the list of possible
objects.
cv : int, cross-validation generator or an iterable, optional
Determines the cross-validation splitting strategy.
Possible inputs for cv are:
- None, to use the default 3-fold cross-validation,
- integer, to specify the number of folds.
- An object to be used as a cross-validation generator.
- An iterable yielding train/test splits.

For integer/None inputs, :class:`KFold` is used.

Refer to the :ref:`User Guide <cross_validation>` for the various
cross-validation strategies that can be used here.

verbose : bool or integer
Amount of verbosity.