[MRG] ENH: Add SVDD to svm module #5899


Closed · wants to merge 12 commits

Conversation

nmayorov
Contributor

Hi!

I noticed there was interest in the Support Vector Data Description (SVDD) algorithm, so I want to finish the PR started by @sklef, which already contained a working version.

I rebased, did some cleanup and extended the documentation.

Very interested in your reviews.

@albertcthomas
Contributor

albertcthomas commented Nov 22, 2015

It may be good to add to the documentation that the solutions of the SVDD and the OCSVM are identical when the Gaussian kernel is used for both methods with the same kernel bandwidth and C = 1/(νN). This is mentioned in the original SVDD paper and in Tax's thesis.

@nmayorov
Contributor Author

@albertthomas88 Is it because, with the Gaussian kernel, all vectors have unit norms? I noticed the equivalence when all vectors are normalized, but hadn't thought about how it applies to the Gaussian kernel.

It also explains why the results were so similar in the "Novelty and Outlier Detection" example. I need to change it.

In this light I think it's better to use a linear kernel by default in SVDD; otherwise we have two classes that behave (by default) almost identically.

Thanks for bringing up this point.

Added: my interpretation wasn't quite correct. The two methods are equivalent when K(x, y) = K(x - y) and the parameters C and ν satisfy the proper relation.

@albertcthomas
Contributor

albertcthomas commented Nov 22, 2015

@nmayorov Yes, the two methods are equivalent if you have unit-norm vectors, which is the case with the Gaussian kernel, because in the feature space you have ||Phi(x)||^2 = k(x, x) = 1.

More generally, the equivalence holds if all vectors have the same norm, i.e., when k(x, x) is constant, which is the case if k(x, y) depends only on x - y. This is mentioned in the OCSVM paper by Schölkopf et al.
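The unit-norm claim is easy to verify numerically. A minimal sketch using scikit-learn's `rbf_kernel` (the data here is arbitrary, chosen just for illustration):

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

# For the Gaussian kernel, k(x, x) = exp(-gamma * ||x - x||^2) = exp(0) = 1,
# so in feature space every point has unit norm: ||Phi(x)||^2 = k(x, x) = 1.
rng = np.random.RandomState(0)
X = rng.randn(5, 3)           # arbitrary data; any gamma gives the same diagonal
K = rbf_kernel(X, gamma=0.5)
print(np.diag(K))             # -> [1. 1. 1. 1. 1.]
```

The same check would print a constant (not necessarily 1) for any shift-invariant kernel k(x, y) = k(x - y).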

@albertcthomas
Contributor

I don't know whether it is a problem to have two similar classes by default, but the OCSVM and the SVDD are known to perform best when the Gaussian kernel is used.

@nmayorov
Contributor Author

@albertthomas88

Yes the two methods are equivalent if you have unit norm vectors

Say we use a linear kernel in 2-D space and all vectors have unit norms: the decision boundary of SVDD is a circle, while that of OneClassSVM is a line. They can't be equivalent! Something is wrong here.
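For what it's worth, the linear-kernel boundary of OneClassSVM really is a line. A hypothetical sketch with made-up unit-norm 2-D data (the SVDD side is not shown, since it is not in scikit-learn):

```python
import numpy as np
from sklearn.svm import OneClassSVM

# With a linear kernel, OneClassSVM learns an affine decision function
# f(x) = <w, x> + b, so its zero level set is a straight line, not a circle.
theta = np.linspace(0.1, np.pi - 0.1, 30)
X = np.column_stack([np.cos(theta), np.sin(theta)])  # unit-norm points

clf = OneClassSVM(kernel="linear", nu=0.5).fit(X)
f = clf.decision_function

# Affinity check: an affine f satisfies f((a + b) / 2) == (f(a) + f(b)) / 2.
a = np.array([[1.0, 0.0]])
b = np.array([[0.0, 1.0]])
mid = (a + b) / 2
print(np.allclose(f(mid), (f(a) + f(b)) / 2))  # True: the boundary is a line
```

The apparent contradiction is resolved below: the SVDD circle and the OCSVM line can still cut out the same region of the unit circle where the data actually lives.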

And the statement about K(x, y) = K(x - y): I sort of took it on faith, but again the mismatch between the shapes of the boundaries troubles me. OK, we work in an infinite-dimensional space, but can that make a plane become a sphere? An intuitive explanation might be that the decision boundaries in the original space nevertheless become the same, but it's quite a tricky thing.
