FIX Correct the definition of gamma=scale in svm #13221
@@ -1710,19 +1710,23 @@ def test_deprecated_grid_search_iid():
     depr_message = ("The default of the `iid` parameter will change from True "
                     "to False in version 0.22")
     X, y = make_blobs(n_samples=54, random_state=0, centers=2)
-    grid = GridSearchCV(SVC(gamma='scale'), param_grid={'C': [1]}, cv=3)
+    grid = GridSearchCV(SVC(gamma='scale', random_state=0),
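We do not always warn: the `iid` deprecation warning is emitted only when the weighted and unweighted mean test scores differ.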
scikit-learn/sklearn/model_selection/_search.py
Lines 795 to 813 in d19a5dc
if self.iid == 'warn':
    warn = False
    for scorer_name in scorers.keys():
        scores = test_scores[scorer_name].reshape(n_candidates,
                                                  n_splits)
        means_weighted = np.average(scores, axis=1,
                                    weights=test_sample_counts)
        means_unweighted = np.average(scores, axis=1)
        if not np.allclose(means_weighted, means_unweighted,
                           rtol=1e-4, atol=1e-4):
            warn = True
            break
    if warn:
        warnings.warn("The default of the `iid` parameter will change "
                      "from True to False in version 0.22 and will be"
                      " removed in 0.24. This will change numeric"
                      " results when test-set sizes are unequal.",
                      DeprecationWarning)
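The check quoted above can be tried in isolation. The following is a minimal sketch, not the scikit-learn implementation: the helper name `needs_iid_warning` and the score values are made up for illustration. It compares three equal test sets of 18 samples (what 54 samples over `cv=3` would give if split evenly) against unequal ones.

```python
import numpy as np

# Hypothetical scores for 2 candidates over 3 CV splits (illustrative values).
scores = np.array([[0.90, 0.80, 0.70],
                   [0.60, 0.60, 0.60]])


def needs_iid_warning(scores, test_sample_counts):
    """Return True when weighted and unweighted split means differ."""
    means_weighted = np.average(scores, axis=1, weights=test_sample_counts)
    means_unweighted = np.average(scores, axis=1)
    return not np.allclose(means_weighted, means_unweighted,
                           rtol=1e-4, atol=1e-4)


# Equal test-set sizes: both averages coincide, so no warning is needed.
equal_splits = needs_iid_warning(scores, [18, 18, 18])
# Unequal test-set sizes: the weighted mean shifts, so a warning is needed.
unequal_splits = needs_iid_warning(scores, [20, 18, 16])
print(equal_splits, unequal_splits)
```

With equal weights the weighted average reduces to the plain mean, which is why a test that uses evenly sized splits never triggers the warning.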
@@ -87,9 +87,9 @@ def test_svc():
     kernels = ["linear", "poly", "rbf", "sigmoid"]
     for dataset in datasets:
         for kernel in kernels:
-            clf = svm.SVC(gamma='scale', kernel=kernel, probability=True,
+            clf = svm.SVC(gamma=1, kernel=kernel, probability=True,
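There is a numerical issue here and below: with gamma='scale', the dense and sparse fits end up with slightly different gamma values and different support vectors.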
X = np.array([[-2, -1], [-1, -1], [-1, -2], [1, 1], [1, 2], [2, 1]])
X_sp = sparse.lil_matrix(X)
Y = [1, 1, 1, 2, 2, 2]
clf = svm.SVC(gamma='scale', kernel=kernel, probability=True,
              random_state=0, decision_function_shape='ovo')
sp_clf = svm.SVC(gamma='scale', kernel=kernel, probability=True,
                 random_state=0, decision_function_shape='ovo')
clf.fit(X, Y)
print(clf._gamma)
print(clf.support_)
sp_clf.fit(X_sp, Y)
print(sp_clf._gamma)
print(sp_clf.support_)
# 0.25
# [0 1 3 4]
# 0.25000000000000006
# [1 2 3 5]
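The thread does not say where the last-digit difference in `_gamma` comes from. One plausible source, offered here only as an assumption, is that the variance is computed by two algebraically equivalent formulas for dense and sparse input. The sketch below uses plain NumPy and made-up data, and it assumes the corrected definition is `1 / (n_features * X.var())`, which the thread does not spell out.

```python
import numpy as np

# Illustrative data, not taken from the thread. The offset makes
# floating-point cancellation in the moment-based formula more visible.
rng = np.random.RandomState(0)
X = rng.rand(6, 2) + 100.0
n_features = X.shape[1]

# Direct variance, as a dense array would typically compute it.
var_direct = X.var()

# Moment-based variance, E[x^2] - E[x]^2, a form that only needs sums
# and is therefore convenient for sparse storage.
var_moments = (X * X).mean() - X.mean() ** 2

# Assumed definition of gamma='scale' (see the lead-in above).
gamma_direct = 1.0 / (n_features * var_direct)
gamma_moments = 1.0 / (n_features * var_moments)

print(repr(gamma_direct))
print(repr(gamma_moments))
print(np.isclose(gamma_direct, gamma_moments))
```

The two values agree to within a tolerance but need not be bit-identical, and a solver that is sensitive to the last digits could then select different support vectors. This would explain why a test comparing dense and sparse fits is more robust with a fixed `gamma`.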
@@ -344,7 +344,7 @@ once will overwrite what was learned by any previous ``fit()``::
       max_iter=-1, probability=False, random_state=None, shrinking=True,
       tol=0.001, verbose=False)
     >>> clf.predict(X_test)
-    array([1, 0, 1, 1, 0])
+    array([0, 0, 0, 1, 0])
These samples are generated randomly (maybe we should avoid doing so in the tutorial?).
LGTM. But because gamma is now a fitted parameter, wouldn't it make sense to expose the estimated bandwidth as a public attribute `gamma_` instead of a private attribute `_gamma`?
I would be +0 for making it public.
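To make the suggestion concrete, here is a minimal sketch of what a public fitted attribute could look like. `ScaledGammaSketch` is a hypothetical class, not part of scikit-learn, and the formula it uses for `'scale'` is an assumption about the corrected definition. It follows the usual convention that values estimated in `fit` carry a trailing underscore while constructor arguments are left untouched.

```python
import numpy as np


class ScaledGammaSketch:
    """Hypothetical estimator used only to illustrate the naming idea."""

    def __init__(self, gamma="scale"):
        self.gamma = gamma  # constructor argument, never modified by fit

    def fit(self, X):
        X = np.asarray(X, dtype=float)
        if self.gamma == "scale":
            # Assumed definition: 1 / (n_features * X.var()).
            self.gamma_ = 1.0 / (X.shape[1] * X.var())
        else:
            self.gamma_ = float(self.gamma)
        return self


# Same toy data as in the review comment above.
X = np.array([[-2, -1], [-1, -1], [-1, -2], [1, 1], [1, 2], [2, 1]])
est = ScaledGammaSketch().fit(X)
print(est.gamma, est.gamma_)
```

Keeping `gamma` and `gamma_` separate lets a caller see both what was requested and what was actually used.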
LGTM (but GitHub doesn't let me click approve?)
Will soon open a PR to update the tutorial.
Also +0 from my side, and this is out of the scope of this PR. Maybe open an issue/PR if someone wants it?
…arn#13221)" This reverts commit 7d8830d.
Closes #12741
See #13186 (comment):
And I'm wondering whether it's possible to solve #12741 in 0.20.3 by changing the definition of `gamma='scale'` directly, since it was introduced erroneously in 0.20. This is an embarrassing mistake, and I think it will be much more difficult to solve in 0.21.X (maybe at that time we'll need to deprecate `scale` and introduce another option).
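For readers who want to see why the definition matters, the sketch below compares two candidate formulas on the toy data from the review comment. It assumes, as my reading of the change rather than something stated in this thread, that 0.20 used `1 / (n_features * X.std())` and that the fix uses `1 / (n_features * X.var())`. The helper names `gamma_std`, `gamma_var`, and `rbf` are illustrative only.

```python
import numpy as np


def gamma_std(X):
    # Assumed 0.20 definition (see the lead-in above).
    return 1.0 / (X.shape[1] * X.std())


def gamma_var(X):
    # Assumed corrected definition.
    return 1.0 / (X.shape[1] * X.var())


def rbf(x, y, gamma):
    # RBF kernel value exp(-gamma * ||x - y||^2) for a single pair.
    return np.exp(-gamma * np.sum((x - y) ** 2))


X = np.array([[-2, -1], [-1, -1], [-1, -2], [1, 1], [1, 2], [2, 1]],
             dtype=float)
X_scaled = 10.0 * X  # same data expressed in different units

k_var = rbf(X[0], X[3], gamma_var(X))
k_var_scaled = rbf(X_scaled[0], X_scaled[3], gamma_var(X_scaled))
k_std = rbf(X[0], X[3], gamma_std(X))
k_std_scaled = rbf(X_scaled[0], X_scaled[3], gamma_std(X_scaled))

print(np.isclose(k_var, k_var_scaled))
print(np.isclose(k_std, k_std_scaled))
```

Squared distances grow with the square of the scale factor, and so does the variance, so the variance-based gamma leaves kernel values unchanged when the data are rescaled. The standard-deviation-based gamma grows only linearly and does not cancel the rescaling.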