LinearSVC ignores sample weights #10873
Comments
Curious. I can't see how this would come about from the code... but
LinearSVC was tacked onto #5274, which supported logistic regression
first. LinearSVC weighting was not tested :(. @MechCoder, do you recall
whether we could have failed to patch LinearSVC properly?
|
Sorry, my forensics were mistaken. LinearSVC weights were only added in #6939,
but the tests aren't sufficient: they don't check that weighting does
anything. I suspect you're right and something is missing here.
|
Hi, it's an old issue, but it still persists in sklearn version 0.20.1. The alternative (for SVM-based models) is to run SVC with a linear kernel; however, it is much slower. Thanks! |
Support for the sample weights in the primal problem seems to be easy, see the patch attached. Dual problem is not that straightforward. |
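To illustrate why the primal case looks easy: the sample weights only need to multiply each sample's loss term. Below is a minimal NumPy sketch, not the attached patch and not liblinear's solver. It assumes an L2-regularized squared hinge loss with no intercept, and uses plain gradient descent; the function name `fit_primal_l2_svm` and the toy data are made up for this illustration. It also checks the property a weighting test could rely on: an integer weight should behave like repeating the sample.

```python
import numpy as np


def fit_primal_l2_svm(X, y, sample_weight=None, C=1.0, n_iter=2000):
    """Gradient descent on 0.5*||w||^2 + C * sum_i s_i * max(0, 1 - y_i * w.x_i)^2.

    Hypothetical helper for illustration only (no intercept term).
    """
    n_samples, n_features = X.shape
    if sample_weight is None:
        s = np.ones(n_samples)
    else:
        s = np.asarray(sample_weight, dtype=float)
    # Upper bound on the gradient's Lipschitz constant, used as a safe step size.
    lipschitz = 1.0 + 2.0 * C * np.sum(s * np.sum(X ** 2, axis=1))
    step = 1.0 / lipschitz
    w = np.zeros(n_features)
    for _ in range(n_iter):
        violation = np.maximum(0.0, 1.0 - y * (X @ w))
        # The sample weights enter only as a per-sample factor on the loss gradient.
        grad = w - 2.0 * C * (X.T @ (s * y * violation))
        w -= step * grad
    return w


# Toy data (assumed): two overlapping Gaussian blobs.
rng = np.random.default_rng(0)
X = np.concatenate([rng.standard_normal((10, 2)) + 1.0, rng.standard_normal((10, 2))])
y = np.array([1.0] * 10 + [-1.0] * 10)

# Integer weights, so that weighting can be compared with repeating rows.
weights = np.ones(len(X))
weights[15:] = 5

w_plain = fit_primal_l2_svm(X, y)
w_weighted = fit_primal_l2_svm(X, y, sample_weight=weights)

repeats = weights.astype(int)
w_repeated = fit_primal_l2_svm(np.repeat(X, repeats, axis=0), np.repeat(y, repeats))

print("unweighted:", w_plain)
print("weighted:  ", w_weighted)
print("repeated:  ", w_repeated)
```

The weighted and repeated fits minimize the same objective, so they should agree up to floating-point rounding, while the unweighted fit should differ. A check of this kind would catch an estimator that silently ignores `sample_weight`.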
@melnikovsky do you want to send a PR with a test? |
Don't we have common tests for sample weights? This seems really bad :-/ |
The fix in #15018 is right. I also think that we should raise a |
liblinear/libsvm provides a version supporting sample_weight: We should indeed synchronize our code to these versions for the sample_weight support |
@glemaitre I have encountered the same issue: LinearSVC ignores sample_weight. This should be fixed as in libsvm: https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/#weights_for_data_instances |
I made a Python library to solve the SVM with sample weights, based on the same algorithm as LibLinear; see https://github.com/statmlben/Variant-SVM |
Hi,
Description
It appears LinearSVC ignores (or suppresses) sample weights, and the model remains the same regardless of the sample weight input.
This can be demonstrated when comparing a LinearSVC model to an SVC model with a linear kernel.
Steps/Code to Reproduce
Extension of the example in:
http://scikit-learn.org/stable/auto_examples/svm/plot_weighted_samples.html#sphx-glr-auto-examples-svm-plot-weighted-samples-py
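The original reproduction script is not preserved here. A minimal sketch of the comparison, under assumptions, might look like the following: the toy data and the weight values are modeled loosely on the linked example and are not the reporter's exact script, and `coef_shift` is a helper name made up for this sketch. It fits each estimator with and without `sample_weight` and reports how much `coef_` moves.

```python
import numpy as np
from sklearn.svm import SVC, LinearSVC

# Toy data in the spirit of the linked example (an assumption, not the reporter's script).
rng = np.random.RandomState(0)
X = np.r_[rng.randn(10, 2) + [1, 1], rng.randn(10, 2)]
y = np.array([1] * 10 + [-1] * 10)

# Hypothetical weights: emphasise a few samples strongly.
sample_weight = np.ones(len(X))
sample_weight[15:] *= 5
sample_weight[9] *= 15


def coef_shift(make_estimator):
    """Largest absolute change in coef_ caused by passing sample_weight to fit."""
    plain = make_estimator().fit(X, y)
    weighted = make_estimator().fit(X, y, sample_weight=sample_weight)
    return float(np.max(np.abs(weighted.coef_ - plain.coef_)))


svc_shift = coef_shift(lambda: SVC(kernel="linear", C=1.0))
linear_svc_shift = coef_shift(lambda: LinearSVC(C=1.0, max_iter=10000, random_state=0))

print("SVC(kernel='linear') coef_ shift:", svc_shift)
print("LinearSVC coef_ shift:          ", linear_svc_shift)
```

On an affected version, the LinearSVC shift would be expected to be zero or close to it, while the SVC shift is clearly nonzero. The exact numbers depend on the installed scikit-learn version.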
Results
In the 4 plots, you can see that the SVC with the linear kernel is affected by the sample weight, while the LinearSVC model is not.
Versions
Windows-10-10.0.16299-SP0
Python 3.5.2 |Anaconda 4.2.0 (64-bit)| (default, Jul 5 2016, 11:41:13) [MSC v.1900 64 bit (AMD64)]
NumPy 1.14.0
SciPy 1.0.0
Scikit-Learn 0.19.1
Thanks!