-
-
Notifications
You must be signed in to change notification settings - Fork 26.2k
[MRG] Update default test_size of ShuffleSplit for 0.21 #13483
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[MRG] Update default test_size of ShuffleSplit for 0.21 #13483
Conversation
The not doing validation in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good but I think we need new tests in test_train_test_split
to cover the new behavior when train_size
is provided but test_size
is left to it's default value.
# simple test
split = train_test_split(X, y, test_size=None, train_size=.5)
X_train, X_test, y_train, y_test = split
assert_equal(len(y_test), len(y_train))
should be changed to:
# simple test
split = train_test_split(X, y, train_size=.5)
X_train, X_test, y_train, y_test = split
assert_equal(len(y_test), len(y_train))
and probably something similar for ShuffleSplit
/ StratifiedShuffleSplit
.
if train_size is not None and train_size_type not in ('i', 'f'): | ||
raise ValueError("Invalid value for train_size: {}".format(train_size)) | ||
if test_size is not None and test_size_type not in ('i', 'f'): | ||
raise ValueError("Invalid value for test_size: {}".format(test_size)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Technically it should be a TypeError
but I guess it's too late to change now as it was already raising a ValueError
for this case in released versions of scikit-learn.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah I guess so...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found another problem:
I added tests for the new default value of |
@jeremiedbb the circle ci status come from your account instead of scikit-learn's. Therefore I cannot restart those to check if the sphinx-gallery error can go away with a rebuild. |
I changed it and reran the CI. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @jeremiedbb!
…rn#13483) completing deprecation
…ikit-learn#13483)" This reverts commit a805e13.
…ikit-learn#13483)" This reverts commit a805e13.
…rn#13483) completing deprecation
Complement of #13443 in order to clean the deprecations for 0.21
I moved the
_split
part a this separate PR because the change is not straightforward and needs more work.I've set
test_size
default value to None, which will be interpreted as the default value (depending on the estimator) whentrain_size
is also None.