-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
[MRG+2] default gamma='auto' in SVC #10331
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Conflicts: sklearn/model_selection/_search.py sklearn/model_selection/tests/test_search.py
Currently it raises an error:
|
I can fix that issue I mention above (that is not a problem). I'v one query, that should we use |
@amueller can you give a look over the PR, though I've few questions as well that I've mentioned. |
Locally tests pass. I am guessing that travis uses scipy version 0.13.3, which doesn't have the |
sklearn/svm/base.py
Outdated
if self.gamma == 'auto': | ||
if self.gamma == 'scale': | ||
if isinstance(X, sp.spmatrix): | ||
X_std = np.sqrt(X.power(2).mean() - (X.mean())**2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've been able to figure out how to do this without the use of X.power
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could simply use X.toarray()**2
, but I don't think that is efficient.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe X.multiply(X)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why can there be a lil
matrix here? We called check_X_y
above with accept_sparse='csr'
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we maybe not deprecate auto
and just change the default? not sure, though....
sklearn/svm/base.py
Outdated
if self.gamma == 'auto': | ||
if self.gamma == 'scale': | ||
if isinstance(X, sp.spmatrix): | ||
X_std = np.sqrt(X.power(2).mean() - (X.mean())**2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why can there be a lil
matrix here? We called check_X_y
above with accept_sparse='csr'
.
I think that is the intention here. That is to replace the default value not to actually remove 'auto' from possible gamma values. |
sklearn/svm/classes.py
Outdated
If gamma is 'auto' then 1/n_features will be used instead. | ||
If gamma is 'auto' then 1/n_features will be used. | ||
If gamma is 'scale' then 1/(n_features * X.std()) will be used. | ||
The current default 'auto' is deprecated in version 0.20 and will |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess the formulation here was what led me to think that 'auto'
is deprecated, i.e. will be removed. I would rather say that "The default value will change to 'scale' in 0.22". I'm not sure about a good way to say when this decision was made. I think we had "This warning was introduced in version X" somewhere.
It's tempting to just change the meaning of auto, isn't it? But as it
affects the model, we probably shouldn't...
|
I can't say if it is tempting, since I don't know the reason for making
this change. I mean, why is using gamma=1 / (n_features * X.std()) is
more apt choice? Can you please refer me to material explaining the
reason for why it performs better than gamma=1 / n_features.
And if it does lead to improvement in model, then why not? And reading
from the discussion on the issue @amueller (at that time) seemed quite
convinced of the benefits of this change in gamma value.
|
There's probably no literature on why this would be good. But the |
I had a look over "A Practical Guide to Support Vector Classification" (by Chih-Wei Hsu, Chih-Chung Chang, and Chih-Jen Lin), and it says that
So considering that, it would imply that in most of the cases scaling is mostly needed.
Yes, I totally understand this part. "Changing" of things is one of the things to upset users. Though I'll take your final word, if you still consider that the change is worth it. |
Yes, but people don't do it. Basically the new default helps people who forget to scale their data. |
That is right. Should I close the PR then? May be the corresponding issue as well? (I can't close the issue, I didn't open). |
What? No. Please go ahead with this. I would do the full deprecation cycle as you did. |
I would just clarify the wording on what exactly changes. |
I don't understand what the problem with CircleCI tests are. Can you
help me out?
|
I don't think the error is related to the PR. It raises
So possibly something wrong with the build configuration itself. |
@amueller any suggestion for the failing tests? |
Ignore it ;) |
…s, fixed the docstring parameter order
…s, fixed the docstring parameter order
…s, fixed the docstring parameter order
…s, fixed the docstring parameter order
…s, fixed the docstring parameter order
…s, fixed the docstring parameter order
…s, fixed the docstring parameter order
…s, fixed the docstring parameter order
…s, fixed the docstring parameter order
…s, fixed the docstring parameter order
…s, fixed the docstring parameter order
…s, fixed the docstring parameter order
…s, fixed the docstring parameter order
…s, fixed the docstring parameter order
…s, fixed the docstring parameter order
…s, fixed the docstring parameter order
…s, fixed the docstring parameter order
…s, fixed the docstring parameter order
…s, fixed the docstring parameter order
…s, fixed the docstring parameter order
…s, fixed the docstring parameter order
…s, fixed the docstring parameter order
…s, fixed the docstring parameter order
…s, fixed the docstring parameter order
…s, fixed the docstring parameter order
…s, fixed the docstring parameter order
…s, fixed the docstring parameter order
…s, fixed the docstring parameter order
…s, fixed the docstring parameter order
Reference Issues/PRs
Fixes #8361
Fixes #8535
What does this implement/fix? Explain your changes.
Deprecates the default SVC gamma parameter value of "auto", which is calculated as 1 / n_features, and introduces "scale", which is calculated as 1 / (n_features * X.std()).
Any other comments?
I couldn't fix all the doctests since I'm not sure how to run doctests using pytest (I asked on gitter, though haven't received a response), otherwise using
make test
will more time to see which docs need to be fixed.