sample_weight feature in LinearSVC (BaseLibLinear/SparseBaseLibLinear)) #409

Closed
aabbi opened this issue Oct 20, 2011 · 11 comments

@aabbi

aabbi commented Oct 20, 2011

Currently, there does not seem to be a way to specify sample_weight when using svm.LinearSVC or svm.sparse.LinearSVC.
Could that feature be implemented, or is there an existing way to do it?
libsvm offers the feature; see http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/#weights_for_data_instances

@agramfort
Member

SVC(kernel='linear') supports sample weights just in case ...
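
A minimal sketch of the suggestion above, assuming a recent scikit-learn in which `SVC.fit` accepts a `sample_weight` array. The toy data and the weight values are invented for illustration only and do not come from this thread.

```python
import numpy as np
from sklearn.svm import SVC

# Toy, linearly separable data (illustrative values, not from the issue).
X = np.array([[-2.0, 0.0], [-1.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
y = np.array([0, 0, 1, 1])

# One weight per training sample; a larger weight makes errors on that
# sample more costly during fitting.
weights = np.array([1.0, 1.0, 1.0, 5.0])

clf = SVC(kernel="linear", C=1.0)
clf.fit(X, y, sample_weight=weights)

# Signed scores relative to the separating hyperplane, one per sample.
scores = clf.decision_function(X)
predictions = clf.predict(X)
```

Whether this route is practical for large sparse inputs depends on the scikit-learn version in use, as the follow-up comment points out.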

@aabbi
Author

aabbi commented Oct 20, 2011

Yes, but I am working on sparse data and I want to use the decision_function method. Since that method is not implemented for svm.sparse.SVC, I have to use LinearSVC. Is there any other way around this?

@GaelVaroquaux
Member

LinearSVC does not rely on libsvm, but on liblinear. Thus it would be necessary to add support for sample weights to liblinear, which is a sizeable amount of work.

@aabbi
Author

aabbi commented Oct 20, 2011

Well, according to the following link:
http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/#weights_for_data_instances
liblinear also has that feature. Can that be used somehow?

@larsmans
Member

No, that's the LibSVM docs. To repeat what Gael said, LinearSVC is built on top of LibLinear, by the same authors.

@aabbi
Author

aabbi commented Oct 25, 2011

Okay. Thanks.

@aabbi aabbi closed this as completed Oct 25, 2011
@jnothman
Member

There does seem to be a LIBLINEAR fork supporting sample weights... http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/#weights_for_data_instances

Supporting sample_weight in LogisticRegression may also be helpful.

I just noticed the same comment was made by @aabbi, but I don't understand why it was rejected. The code there is indeed a modified version of LibLinear.

@jnothman
Member

Seeing as scikit-learn has adopted the nonstandard LibSVM with sample weights, I assume the changes to LIBLINEAR should indeed be merged, so I'm reopening this Issue (and am willing to be shot down for doing so!). The differences can be viewed at http://github.com/jnothman/scikit-learn/tree/liblinear-sample_weight (particularly jnothman@54ec588). I have based this on the commit before Lars imported liblinear 0.91; there was no record of the diff from the downloaded LIBLINEAR 0.91 to that commit, which should have involved a branch and merge. I haven't (yet) modified our set_problem etc. nor merged in the later changes to linear.cpp, and am more than happy for someone else to do so :P

@jnothman jnothman reopened this May 29, 2013
@GaelVaroquaux
Member

We already have class weights. You want to add sample weights as well, right?

Should class_weights then be refactored to use sample_weights? It would be possible and might make the code simpler.

The tricky business here will be to make really sure that all the patches applied to liblinear are kept.
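
As a rough illustration of the refactoring idea in the comment above, class weights can be folded into per-sample weights. The sketch below is not scikit-learn or liblinear code: `fold_class_weights` is a hypothetical helper, and it assumes the effective cost of a sample is the product of its class weight and its own sample weight.

```python
def fold_class_weights(y, class_weight, sample_weight=None):
    """Hypothetical helper: return one effective weight per sample.

    Assumes the effective weight is class_weight[label] * sample_weight[i],
    with a missing class or missing sample weights defaulting to 1.0.
    """
    if sample_weight is None:
        sample_weight = [1.0] * len(y)
    if len(sample_weight) != len(y):
        raise ValueError("sample_weight and y must have the same length")
    return [class_weight.get(label, 1.0) * w
            for label, w in zip(y, sample_weight)]


y = [0, 1, 1, 0]
# Class 1 counts double; individual samples carry their own weights too.
effective = fold_class_weights(y, {1: 2.0}, sample_weight=[1.0, 0.5, 1.0, 3.0])
print(effective)
```

Under that assumption a solver would only need to understand per-sample weights, which is the simplification the comment hints at.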

@jnothman
Member

> You want to add sample weights as well, right?

For symmetry with LibSVM if nothing else.

> Should class_weights then be refactored to use sample_weights?

It looks like that liblinear fork supports both independently.

> The tricky business here will be to make really sure that all the patches applied to liblinear are kept.

I've tried the merge that should apply them; there are 15 blocks with conflicts in linear.cpp and that's it. "Making really sure" hopefully is a matter of tests passing :s

@amueller
Member

Fixed at some point since then.
