Description
It is important to stratify samples by y when cross-validating regression models; otherwise, the training and validation sets can end up covering very different ranges of y. However, StratifiedKFold currently does not handle float targets:
$ x=sklearn.cross_validation.StratifiedKFold(np.random.random(9), 2)
/anaconda/envs/py3/lib/python3.4/site-packages/sklearn/cross_validation.py:417: Warning: The least populated class in y has only 1 members, which is too few. The minimum number of labels for any class cannot be less than n_folds=2.
% (min_labels, self.n_folds)), Warning)
$ list(x)
[(array([], dtype=int64), array([0, 1, 2, 3, 4, 5, 6, 7, 8])),
(array([0, 1, 2, 3, 4, 5, 6, 7, 8]), array([], dtype=int64))]
In case I am missing something: is there a reason why StratifiedKFold does not work properly for float targets?
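The warning above suggests that each distinct float in y is being treated as its own class, so nine random values give nine classes with one member each. A possible workaround, as a minimal sketch only: sort the samples by y, cut the sorted order into groups of n_folds neighbours, and deal each group's members to different folds. The helper name `stratified_regression_folds` and its `seed` parameter are hypothetical and not part of scikit-learn; the sketch uses only the standard library.

```python
import random


def stratified_regression_folds(y, n_folds, seed=0):
    """Yield (train_idx, test_idx) pairs stratified on a continuous target.

    Hypothetical helper, not part of scikit-learn: samples are sorted by y,
    cut into consecutive groups of n_folds neighbours, and each group's
    members are dealt to different folds at random.
    """
    if not 2 <= n_folds <= len(y):
        raise ValueError("n_folds must be between 2 and len(y)")
    rng = random.Random(seed)
    # Indices of the samples in ascending order of y.
    order = sorted(range(len(y)), key=lambda i: y[i])
    folds = [[] for _ in range(n_folds)]
    for start in range(0, len(order), n_folds):
        group = order[start:start + n_folds]  # neighbours with similar y
        fold_ids = list(range(n_folds))
        rng.shuffle(fold_ids)
        # zip stops at the shorter sequence, so a short last group is fine.
        for idx, fold in zip(group, fold_ids):
            folds[fold].append(idx)
    for k in range(n_folds):
        test = sorted(folds[k])
        train = sorted(i for j in range(n_folds) if j != k for i in folds[j])
        yield train, test


y = [0.05, 0.91, 0.33, 0.48, 0.77, 0.12, 0.64, 0.29, 0.86]
splits = list(stratified_regression_folds(y, n_folds=2))
for train, test in splits:
    print(train, test)
```

With this scheme every fold receives one sample from each group of neighbouring y values, so both the training and the validation sets span roughly the full range of y, and fold sizes differ by at most one. Binning y into quantiles and passing the bin labels to StratifiedKFold would be another option along the same lines.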