[architectural suggestion] Move random number generator to initializer in model_selection._split #6726
Note that the functionality I describe is already available in `sklearn.cross_validation`:

```python
import numpy as np
import sklearn.cross_validation

y = np.random.randn(100)  # setup added so the snippet runs stand-alone
n_folds = 9
# In the old API the RNG is drawn at construction time, so iterating the
# splitter twice yields identical folds even without a fixed random_state.
kf = sklearn.cross_validation.KFold(len(y), n_folds=n_folds, shuffle=True)
for tr, ts in kf:
    print(ts)
# is the same as:
for tr, ts in kf:
    print(ts)
```
What about the new `model_selection` API:

```python
import numpy as np
from sklearn.model_selection import KFold

y = np.random.randn(100)
# Note: the model_selection version takes n_splits (not n_folds).
kf = KFold(n_splits=9, shuffle=True, random_state=0)
for tr, ts in kf.split(y):
    print(ts)
# is the same as:
print('')
for tr, ts in kf.split(y):
    print(ts)
```
I do use `random_state`.
So you want the splits to be random, but consistent across all parameters? If only I had read this before the release :-/ (also, we merged …)
Yes, it is a problem that …. We should call …. Whether or not we also move the random_state initialisation to ….

(And yes, you can create the list and should not need to copy or deepcopy in most cases.)

Maybe we needed the new CV splitters to be an experimental release before deprecating the old ones.

@amueller and @ogrisel, given that we've found some substantial issues (this and pickling at least), can we set some goals for 0.18.1 in terms of timescale? Aim to RC at the beginning of November, for instance? And then focus almost exclusively on release-critical issues? I also think the 0.18.1 release plan should be noted on the mailing list.
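A minimal sketch of the "create the list" workaround mentioned above, assuming a hypothetical grid-search loop (the parameter grid and the `fit_and_score` name in the comment are placeholder stand-ins, not scikit-learn API):

```python
import numpy as np
from sklearn.model_selection import KFold

y = np.random.randn(100)
kf = KFold(n_splits=9, shuffle=True)

# Materialise the splits once; the resulting list of (train, test) index
# arrays can be reused for every parameter, with no copy/deepcopy needed.
splits = list(kf.split(y))

for param in [0.1, 1.0, 10.0]:      # placeholder parameter grid
    for train, test in splits:      # identical folds for every param
        pass                        # fit_and_score(param, train, test) would go here
```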
I have to think about this issue again; I think I didn't really understand it. Not sure how the behavior was before.
@jnothman Unless the user is using a cv which will produce different train/test sets for each …:

```python
from sklearn.model_selection import ShuffleSplit
from sklearn.cross_validation import ShuffleSplit as OldShuffleSplit
import numpy as np

y = np.random.RandomState(0).randn(100)
n_iter = 10
ss = ShuffleSplit(n_splits=n_iter, random_state=0)
old_ss_iter = OldShuffleSplit(len(y), n_iter=n_iter, random_state=0)
new_ss_iter_1 = list(ss.split(y))
new_ss_iter_2 = list(ss.split(y))
# Let's change the random state and make a 3rd split(y) call.
ss.random_state = 42
new_ss_iter_3 = list(ss.split(y))
for trte1, trte2, trte3, trte4 in zip(
        new_ss_iter_1, new_ss_iter_2, new_ss_iter_3, old_ss_iter):
    np.testing.assert_equal(trte1, trte2)
    np.testing.assert_equal(trte1, trte4)
    # This will fail as the random states are not the same
    np.testing.assert_equal(trte2, trte3)
```

And you will observe the same behavior with `KFold`:

```python
from sklearn.model_selection import KFold
from sklearn.cross_validation import KFold as OldKFold
import numpy as np

y = np.random.RandomState(0).randn(100)
n_splits = 10
kf = KFold(n_splits=n_splits, shuffle=True, random_state=0)
old_kf_iter = OldKFold(len(y), n_folds=n_splits, shuffle=True, random_state=0)
new_kf_iter_1 = list(kf.split(y))
new_kf_iter_2 = list(kf.split(y))
# Let's change the random state and make a 3rd split(y) call.
kf.random_state = 42
new_kf_iter_3 = list(kf.split(y))
for trte1, trte2, trte3, trte4 in zip(
        new_kf_iter_1, new_kf_iter_2, new_kf_iter_3, old_kf_iter):
    np.testing.assert_equal(trte1, trte2)
    np.testing.assert_equal(trte1, trte4)
    # This will fail as the random states are not the same
    np.testing.assert_equal(trte2, trte3)
```
It does give consistently shuffled indices (see the above code snippets for proof) now, but only because it is not at ….

@DSLituiev Could you share a sample code that works for you in …?
I don't think that you should assume most users set ….

@DSLituiev's particular issue can be solved by documenting that if someone wants consistent splits they should use ….

I now think there is also an issue in ….
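A minimal sketch of that documentation fix, assuming the elided advice refers to passing an integer `random_state` (shown here only to make the workaround concrete):

```python
import numpy as np
from sklearn.model_selection import KFold

y = np.random.randn(100)
# With an integer random_state, the RNG is re-seeded on every split() call,
# so repeated calls produce identical folds.
kf = KFold(n_splits=9, shuffle=True, random_state=0)

first = [test.tolist() for _, test in kf.split(y)]
second = [test.tolist() for _, test in kf.split(y)]
assert first == second  # same folds on every call
```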
Being practical about timescale, @amueller: doing 0.18.1 in October means releasing in November. There are two weeks of October left.
Ah, I understand the issue... Thanks for the clarification.

So IIUC, this + documenting that without ….
Well, it's clear there are a few ways to define the problem. I think that ….
PR at #7660
I'm closing this as we decided that the current behavior after #7660 is sufficient.
Personally I think making multiple calls to `split` return identical results is still a much better design. But I have no idea how to fix it.
This suggestion concerns random shuffling in the new `model_selection` module.

I faced a challenge in the following set-up: I do a grid search with CV, and I want the CV reshuffling to be consistent for each parameter I am looping through. It now seems impossible to do with `model_selection.KFold`, as `copy.copy()` and `copy.deepcopy()` lead to an error when called after `kf_.split(y)`; copying the splitter earlier does not make sense, as the RNG is initialized only during the `kf_.split(y)` call.

One solution is to specify the seed for each shuffling fold. Another, more fundamental, solution is to refactor `model_selection` and move `check_random_state(self.random_state)` to the `__init__` of `_BaseKFold`; then each `kf_.split(y)` call will give consistently shuffled indices, as in the sketch below.
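A minimal sketch of what the proposed refactor could look like (an illustration only, not scikit-learn's actual `_BaseKFold` code; the class name and fold logic are simplified stand-ins):

```python
import numpy as np
from sklearn.utils import check_random_state

class SeededKFold:
    """Toy KFold-like splitter that resolves its seed in __init__."""

    def __init__(self, n_splits=5, shuffle=True, random_state=None):
        self.n_splits = n_splits
        self.shuffle = shuffle
        # Proposed change: resolve the random state here, once, so that
        # every subsequent split() call reuses the same effective seed,
        # even when random_state is None.
        self._seed = check_random_state(random_state).randint(np.iinfo(np.int32).max)

    def split(self, X):
        indices = np.arange(len(X))
        if self.shuffle:
            np.random.RandomState(self._seed).shuffle(indices)
        # Distribute samples across folds, giving the remainder to the
        # first len(X) % n_splits folds (same convention as KFold).
        fold_sizes = np.full(self.n_splits, len(X) // self.n_splits, dtype=int)
        fold_sizes[: len(X) % self.n_splits] += 1
        current = 0
        for size in fold_sizes:
            test = indices[current:current + size]
            train = np.concatenate([indices[:current], indices[current + size:]])
            yield train, test
            current += size

# Repeated split() calls now agree even without an explicit seed:
cv = SeededKFold(n_splits=3)
a = [t.tolist() for _, t in cv.split(np.arange(10))]
b = [t.tolist() for _, t in cv.split(np.arange(10))]
assert a == b
```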
Versions
Python 3.5.1 (v3.5.1:37a07cee5969, Dec 5 2015, 21:12:44)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)]
NumPy 1.11.0
SciPy 0.17.0
Scikit-Learn 0.18.dev0