
FIX Correct the definition of gamma=scale in svm #13221


Merged 7 commits on Feb 24, 2019
4 changes: 2 additions & 2 deletions doc/modules/model_evaluation.rst
@@ -102,7 +102,7 @@ Usage examples:
>>> clf = svm.SVC(gamma='scale', random_state=0)
>>> cross_val_score(clf, X, y, scoring='recall_macro',
... cv=5) # doctest: +ELLIPSIS, +NORMALIZE_WHITESPACE
-array([0.96..., 1. ..., 0.96..., 0.96..., 1. ])
+array([0.96..., 0.96..., 0.96..., 0.93..., 1. ])
>>> model = svm.SVC()
>>> cross_val_score(model, X, y, cv=5, scoring='wrong_choice')
Traceback (most recent call last):
@@ -1947,7 +1947,7 @@ change the kernel::

>>> clf = SVC(gamma='scale', kernel='rbf', C=1).fit(X_train, y_train)
>>> clf.score(X_test, y_test) # doctest: +ELLIPSIS
-0.97...
+0.94...

We see that the accuracy was boosted to almost 100%. A cross validation
strategy is recommended for a better estimate of the accuracy, if it
2 changes: 1 addition & 1 deletion doc/tutorial/basic/tutorial.rst
@@ -344,7 +344,7 @@ once will overwrite what was learned by any previous ``fit()``::
max_iter=-1, probability=False, random_state=None, shrinking=True,
tol=0.001, verbose=False)
>>> clf.predict(X_test)
-array([1, 0, 1, 1, 0])
+array([0, 0, 0, 1, 0])
Member Author:

these samples are generated randomly (maybe we should avoid doing so in the tutorial?)


Here, the default kernel ``rbf`` is first changed to ``linear`` via
:func:`SVC.set_params()<sklearn.svm.SVC.set_params>` after the estimator has
22 changes: 16 additions & 6 deletions doc/whats_new/v0.20.rst
@@ -36,6 +36,14 @@ Changelog
threaded when `n_jobs > 1` or `n_jobs = -1`.
:issue:`13005` by :user:`Prabakaran Kumaresshan <nixphix>`.

+:mod:`sklearn.feature_extraction`
+.................................
+
+- |Fix| Fixed a bug in :class:`feature_extraction.text.CountVectorizer` which
+  would result in the sparse feature matrix having conflicting `indptr` and
+  `indices` precisions under very large vocabularies. :issue:`11295` by
+  :user:`Gabriel Vacaliuc <gvacaliuc>`.

:mod:`sklearn.impute`
.....................

@@ -68,13 +76,15 @@ Changelog
with a warning in :class:`preprocessing.KBinsDiscretizer`.
:issue:`13165` by :user:`Hanmin Qin <qinhanmin2014>`.

-:mod:`sklearn.feature_extraction.text`
-......................................
+:mod:`sklearn.svm`
+..................

-- |Fix| Fixed a bug in :class:`feature_extraction.text.CountVectorizer` which
-  would result in the sparse feature matrix having conflicting `indptr` and
-  `indices` precisions under very large vocabularies. :issue:`11295` by
-  :user:`Gabriel Vacaliuc <gvacaliuc>`.
+- |Fix| Fixed a bug in :class:`svm.SVC`, :class:`svm.NuSVC`, :class:`svm.SVR`,
+  :class:`svm.NuSVR` and :class:`svm.OneClassSVM` where the ``scale`` option
+  of parameter ``gamma`` was erroneously defined as
+  ``1 / (n_features * X.std())``. It is now defined as
+  ``1 / (n_features * X.var())``.
+  :issue:`13221` by :user:`Hanmin Qin <qinhanmin2014>`.

.. _changes_0_20_2:

12 changes: 8 additions & 4 deletions sklearn/model_selection/tests/test_search.py
@@ -1710,19 +1710,23 @@ def test_deprecated_grid_search_iid():
depr_message = ("The default of the `iid` parameter will change from True "
"to False in version 0.22")
X, y = make_blobs(n_samples=54, random_state=0, centers=2)
-grid = GridSearchCV(SVC(gamma='scale'), param_grid={'C': [1]}, cv=3)
+grid = GridSearchCV(SVC(gamma='scale', random_state=0),
Member Author: We do not always warn.

if self.iid == 'warn':
    warn = False
    for scorer_name in scorers.keys():
        scores = test_scores[scorer_name].reshape(n_candidates,
                                                  n_splits)
        means_weighted = np.average(scores, axis=1,
                                    weights=test_sample_counts)
        means_unweighted = np.average(scores, axis=1)
        if not np.allclose(means_weighted, means_unweighted,
                           rtol=1e-4, atol=1e-4):
            warn = True
            break
    if warn:
        warnings.warn("The default of the `iid` parameter will change "
                      "from True to False in version 0.22 and will be"
                      " removed in 0.24. This will change numeric"
                      " results when test-set sizes are unequal.",
                      DeprecationWarning)

+                    param_grid={'C': [10]}, cv=3)
# no warning with equally sized test sets
assert_no_warnings(grid.fit, X, y)

-grid = GridSearchCV(SVC(gamma='scale'), param_grid={'C': [1]}, cv=5)
+grid = GridSearchCV(SVC(gamma='scale', random_state=0),
+                    param_grid={'C': [10]}, cv=5)
# warning because 54 % 5 != 0
assert_warns_message(DeprecationWarning, depr_message, grid.fit, X, y)

-grid = GridSearchCV(SVC(gamma='scale'), param_grid={'C': [1]}, cv=2)
+grid = GridSearchCV(SVC(gamma='scale', random_state=0),
+                    param_grid={'C': [10]}, cv=2)
# warning because stratification into two classes and 27 % 2 != 0
assert_warns_message(DeprecationWarning, depr_message, grid.fit, X, y)

-grid = GridSearchCV(SVC(gamma='scale'), param_grid={'C': [1]}, cv=KFold(2))
+grid = GridSearchCV(SVC(gamma='scale', random_state=0),
+                    param_grid={'C': [10]}, cv=KFold(2))
# no warning because no stratification and 54 % 2 == 0
assert_no_warnings(grid.fit, X, y)

12 changes: 6 additions & 6 deletions sklearn/svm/base.py
@@ -169,19 +169,19 @@ def fit(self, X, y, sample_weight=None):

if self.gamma in ('scale', 'auto_deprecated'):
    if sparse:
-        # std = sqrt(E[X^2] - E[X]^2)
-        X_std = np.sqrt((X.multiply(X)).mean() - (X.mean())**2)
+        # var = E[X^2] - E[X]^2
+        X_var = (X.multiply(X)).mean() - (X.mean()) ** 2
    else:
-        X_std = X.std()
+        X_var = X.var()
    if self.gamma == 'scale':
-        if X_std != 0:
-            self._gamma = 1.0 / (X.shape[1] * X_std)
+        if X_var != 0:
+            self._gamma = 1.0 / (X.shape[1] * X_var)
        else:
            self._gamma = 1.0
    else:
        kernel_uses_gamma = (not callable(self.kernel) and self.kernel
                             not in ('linear', 'precomputed'))
-        if kernel_uses_gamma and not np.isclose(X_std, 1.0):
+        if kernel_uses_gamma and not np.isclose(X_var, 1.0):
            # NOTE: when deprecation ends we need to remove explicitly
            # setting `gamma` in examples (also in tests). See
            # https://github.com/scikit-learn/scikit-learn/pull/10331
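A sanity check on the sparse branch in this diff: scipy.sparse matrices have no ``var`` method, which is why the code computes E[X^2] - E[X]^2 instead. The following sketch uses a dense NumPy stand-in (hypothetical values), since the identity itself is what matters:

```python
import numpy as np

# Dense stand-in for a sparse feature matrix (hypothetical values).
X = np.array([[0.0, 2.0],
              [0.0, 4.0],
              [1.0, 0.0]])

# The sparse branch's formula: var = E[X^2] - E[X]^2,
# computed over all entries of the matrix.
var_identity = (X * X).mean() - X.mean() ** 2

# Agrees with what the dense branch gets from X.var().
print(var_identity, X.var())
```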
10 changes: 5 additions & 5 deletions sklearn/svm/classes.py
@@ -463,7 +463,7 @@ class SVC(BaseSVC):
Kernel coefficient for 'rbf', 'poly' and 'sigmoid'.

Current default is 'auto' which uses 1 / n_features,
-if ``gamma='scale'`` is passed then it uses 1 / (n_features * X.std())
+if ``gamma='scale'`` is passed then it uses 1 / (n_features * X.var())
as value of gamma. The current default of gamma, 'auto', will change
to 'scale' in version 0.22. 'auto_deprecated', a deprecated version of
'auto' is used as a default indicating that no explicit value of gamma
@@ -651,7 +651,7 @@ class NuSVC(BaseSVC):
Kernel coefficient for 'rbf', 'poly' and 'sigmoid'.

Current default is 'auto' which uses 1 / n_features,
-if ``gamma='scale'`` is passed then it uses 1 / (n_features * X.std())
+if ``gamma='scale'`` is passed then it uses 1 / (n_features * X.var())
as value of gamma. The current default of gamma, 'auto', will change
to 'scale' in version 0.22. 'auto_deprecated', a deprecated version of
'auto' is used as a default indicating that no explicit value of gamma
@@ -812,7 +812,7 @@ class SVR(BaseLibSVM, RegressorMixin):
Kernel coefficient for 'rbf', 'poly' and 'sigmoid'.

Current default is 'auto' which uses 1 / n_features,
-if ``gamma='scale'`` is passed then it uses 1 / (n_features * X.std())
+if ``gamma='scale'`` is passed then it uses 1 / (n_features * X.var())
as value of gamma. The current default of gamma, 'auto', will change
to 'scale' in version 0.22. 'auto_deprecated', a deprecated version of
'auto' is used as a default indicating that no explicit value of gamma
@@ -948,7 +948,7 @@ class NuSVR(BaseLibSVM, RegressorMixin):
Kernel coefficient for 'rbf', 'poly' and 'sigmoid'.

Current default is 'auto' which uses 1 / n_features,
-if ``gamma='scale'`` is passed then it uses 1 / (n_features * X.std())
+if ``gamma='scale'`` is passed then it uses 1 / (n_features * X.var())
as value of gamma. The current default of gamma, 'auto', will change
to 'scale' in version 0.22. 'auto_deprecated', a deprecated version of
'auto' is used as a default indicating that no explicit value of gamma
@@ -1065,7 +1065,7 @@ class OneClassSVM(BaseLibSVM, OutlierMixin):
Kernel coefficient for 'rbf', 'poly' and 'sigmoid'.

Current default is 'auto' which uses 1 / n_features,
-if ``gamma='scale'`` is passed then it uses 1 / (n_features * X.std())
+if ``gamma='scale'`` is passed then it uses 1 / (n_features * X.var())
as value of gamma. The current default of gamma, 'auto', will change
to 'scale' in version 0.22. 'auto_deprecated', a deprecated version of
'auto' is used as a default indicating that no explicit value of gamma
8 changes: 4 additions & 4 deletions sklearn/svm/tests/test_sparse.py
@@ -87,9 +87,9 @@ def test_svc():
kernels = ["linear", "poly", "rbf", "sigmoid"]
for dataset in datasets:
for kernel in kernels:
-clf = svm.SVC(gamma='scale', kernel=kernel, probability=True,
+clf = svm.SVC(gamma=1, kernel=kernel, probability=True,
Member Author: numerical issue here and below:

X = np.array([[-2, -1], [-1, -1], [-1, -2], [1, 1], [1, 2], [2, 1]])
X_sp = sparse.lil_matrix(X)
Y = [1, 1, 1, 2, 2, 2]
clf = svm.SVC(gamma='scale', kernel=kernel, probability=True,
              random_state=0, decision_function_shape='ovo')
sp_clf = svm.SVC(gamma='scale', kernel=kernel, probability=True,
                 random_state=0, decision_function_shape='ovo')
clf.fit(X, Y)
print(clf._gamma)
print(clf.support_)
sp_clf.fit(X_sp, Y)
print(sp_clf._gamma)
print(sp_clf.support_)
# 0.25
# [0 1 3 4]
# 0.25000000000000006
# [1 2 3 5]

              random_state=0, decision_function_shape='ovo')
-sp_clf = svm.SVC(gamma='scale', kernel=kernel, probability=True,
+sp_clf = svm.SVC(gamma=1, kernel=kernel, probability=True,
                 random_state=0, decision_function_shape='ovo')
check_svm_model_equal(clf, sp_clf, *dataset)

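The "numerical issue" the comment refers to: with this data, the dense path computes gamma='scale' as exactly 0.25, while the sparse path evaluates E[X^2] - E[X]^2 in a different order and can land on 0.25000000000000006 — a one-ulp difference that is enough to select a different set of support vectors, hence pinning gamma=1 in these tests. A quick check of the dense value (NumPy only):

```python
import numpy as np

# Data from the review comment above.
X = np.array([[-2, -1], [-1, -1], [-1, -2],
              [1, 1], [1, 2], [2, 1]], dtype=float)

# gamma='scale' resolves to 1 / (n_features * X.var()).
# Here X.var() == 2.0 over all 12 entries, so gamma == 0.25.
gamma = 1.0 / (X.shape[1] * X.var())
print(gamma)  # 0.25
```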
@@ -293,8 +293,8 @@ def test_sparse_oneclasssvm(datasets_index, kernel):
[X_blobs[:80], None, X_blobs[80:]],
[iris.data, None, iris.data]]
dataset = datasets[datasets_index]
-clf = svm.OneClassSVM(gamma='scale', kernel=kernel)
-sp_clf = svm.OneClassSVM(gamma='scale', kernel=kernel)
+clf = svm.OneClassSVM(gamma=1, kernel=kernel)
+sp_clf = svm.OneClassSVM(gamma=1, kernel=kernel)
check_svm_model_equal(clf, sp_clf, *dataset)


10 changes: 5 additions & 5 deletions sklearn/svm/tests/test_svm.py
@@ -243,11 +243,11 @@ def test_oneclass():
clf.fit(X)
pred = clf.predict(T)

-assert_array_equal(pred, [-1, -1, -1])
+assert_array_equal(pred, [1, -1, -1])
assert_equal(pred.dtype, np.dtype('intp'))
-assert_array_almost_equal(clf.intercept_, [-1.117], decimal=3)
+assert_array_almost_equal(clf.intercept_, [-1.218], decimal=3)
assert_array_almost_equal(clf.dual_coef_,
-                          [[0.681, 0.139, 0.68, 0.14, 0.68, 0.68]],
+                          [[0.750, 0.750, 0.750, 0.750]],
                          decimal=3)
assert_raises(AttributeError, lambda: clf.coef_)

@@ -1003,9 +1003,9 @@ def test_gamma_scale():

clf = svm.SVC(gamma='scale')
assert_no_warnings(clf.fit, X, y)
-assert_equal(clf._gamma, 2.)
+assert_almost_equal(clf._gamma, 4)

-# X_std ~= 1 shouldn't raise warning, for when
+# X_var ~= 1 shouldn't raise warning, for when
# gamma is not explicitly set.
X, y = [[1, 2], [3, 2 * np.sqrt(6) / 3 + 2]], [0, 1]
assert_no_warnings(clf.fit, X, y)
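The second dataset in this test is constructed so that its variance is exactly 1: with a = 2*sqrt(6)/3, the four entries 1, 2, 3, 2 + a have squared deviations that sum to 2 + 3a^2/4 = 4 over 4 entries, so X.var() == 1 and the deprecation warning is suppressed. A quick verification:

```python
import numpy as np

# Same construction as in the test above.
X = np.array([[1, 2], [3, 2 * np.sqrt(6) / 3 + 2]])
print(X.var())  # ~1.0
```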