[FEAT] Implement quantile SVR #23153

Conversation
LGTM analysis fails due to version bump in PR #22674. Should I bump it in
* implement sparse version of quantile SVR
Rebased onto latest main and fixed unit tests. It looks like the LGTM issue has been fixed in the meantime, yay! But the
Thanks for the PR, this is very interesting. Given that this new estimator can be implemented with very little new code, I think it's worth considering for inclusion in scikit-learn.

To make the PR easier to review, could you please change your PR to avoid touching lines of `svm.cpp` that are not related to the topic of the PR (e.g. trailing spaces)?

For the example, it would be interesting to compare several methods (a sketch of all three follows this comment):
- nonlinear feature engineering with `SplineTransformer` (+ `Nystroem`?) followed by `QuantileRegressor`
- quantile SVR with a non-linear kernel
- tree-based quantile methods (e.g. gradient boosted trees)

and have a concluding paragraph that gives some pros and cons of each method.

Also, to avoid example proliferation, maybe it would be worth expanding the existing example by adding new sections at the end.

About the CI timeout, it's unfortunate and we need to debug it, but it might be transient, so feel free to push a new empty commit to your PR to re-trigger the CI when this happens (and/or merge main to make sure that if/when we fix the problem in main, your PR can benefit from the fix).
/cc @lorentzenchr
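A minimal sketch of the three approaches named above, assuming the `QuantileSVR` constructor proposed in this PR (the class is not part of released scikit-learn, and all hyperparameter values here are purely illustrative):

```python
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import SplineTransformer
from sklearn.kernel_approximation import Nystroem
from sklearn.linear_model import QuantileRegressor
from sklearn.ensemble import GradientBoostingRegressor

quantile = 0.9

# 1) Nonlinear feature engineering followed by a linear quantile regressor.
spline_model = make_pipeline(
    SplineTransformer(n_knots=10),
    Nystroem(kernel="rbf", n_components=50, random_state=0),
    QuantileRegressor(quantile=quantile, alpha=1e-4),
)

# 2) Quantile SVR with a non-linear kernel (the estimator proposed in this
#    PR; signature assumed, only available on the PR branch).
# qsvr = QuantileSVR(kernel="rbf", C=1.0, quantile=quantile)

# 3) A tree-based quantile method: gradient boosting with the pinball loss.
gbdt_model = GradientBoostingRegressor(loss="quantile", alpha=quantile)
```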
```python
assert_allclose(np.linalg.norm(qvr.coef_), np.linalg.norm(svr.coef_), 1, 0.0001)
assert_almost_equal(score1, score2, 2)
```
Could you please add tests that check that it converges to the expected results (e.g. by measuring the pinball loss on the training set and checking that it's small) for different values of the quantile parameter, for instance on synthetic data with fixed, repeated X values and a known distribution of Y|X.
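A hypothetical test along these lines, assuming the `QuantileSVR` API from this PR (this is a sketch, not code from the actual test suite):

```python
import numpy as np
from sklearn.metrics import mean_pinball_loss
from sklearn.svm import QuantileSVR  # only available on this PR's branch


def test_quantile_svr_calibration():
    # Synthetic data with a known conditional distribution of Y|X.
    rng = np.random.RandomState(0)
    X = rng.uniform(0, 1, size=(1000, 1))
    y = X.ravel() + rng.normal(scale=0.1, size=1000)
    for quantile in [0.1, 0.5, 0.9]:
        qsvr = QuantileSVR(kernel="linear", C=10.0, quantile=quantile).fit(X, y)
        y_pred = qsvr.predict(X)
        # The empirical coverage on the training set should match the
        # requested quantile ...
        assert abs(np.mean(y <= y_pred) - quantile) < 0.05
        # ... and the training pinball loss should be small (thresholds are
        # illustrative, not taken from the PR).
        assert mean_pinball_loss(y, y_pred, alpha=quantile) < 0.1
```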
Okay, I will get to this soon! Thank you so much for reviewing!
Alright, I added several additional tests (which I admit to having shamelessly stolen from `QuantileRegressor`). They test that the quantiles are well calibrated with any kernel, that illegal inputs raise appropriate errors, and, finally, that (linear) QSVR gives approximately the same result as a Nelder-Mead minimization of the pinball loss.
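A simplified sketch of that last check, comparing linear QSVR against a direct Nelder-Mead minimization of the pinball loss (the actual test in the PR may differ; `QuantileSVR` and its parameters are assumptions):

```python
import numpy as np
from scipy.optimize import minimize
from sklearn.metrics import mean_pinball_loss

rng = np.random.RandomState(42)
X = rng.normal(size=(200, 1))
y = 2.0 * X.ravel() + rng.normal(size=200)
quantile = 0.8


def pinball_objective(params):
    # Pinball loss of a linear model y = intercept + coef * x.
    intercept, coef = params
    return mean_pinball_loss(y, intercept + coef * X.ravel(), alpha=quantile)


# Direct Nelder-Mead minimization of the pinball loss.
res = minimize(pinball_objective, x0=np.zeros(2), method="Nelder-Mead")

# With weak regularization (large C), a linear QSVR fit should roughly
# agree with the directly minimized solution, e.g.:
# qsvr = QuantileSVR(kernel="linear", C=1e4, quantile=quantile).fit(X, y)
# np.testing.assert_allclose(
#     [qsvr.intercept_[0], qsvr.coef_.ravel()[0]], res.x, atol=0.1
# )
```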
doc/whats_new/v1.1.rst (outdated)

```rst
@@ -1131,6 +1131,10 @@ Changelog

:mod:`sklearn.svm`
..................

- |Feature| :class:`svm.QuantileSVR` implements quantile regression with support vector
  machines as derived in Hwang, C., Shim, J. (2005), DOI: 10.1007/11539087_66.
  :pr:`23153` by :user:`Alexander Trettin <atrettin>`.
```
This will need to be moved to v1.2.rst (sorry for the slow feedback...).
Done!
Thank you for having a look at this, @ogrisel! I will look at the example and perhaps merge it with the example you linked, and add some more unit tests (where I might just shamelessly copy a few things from the `QuantileRegressor` tests).
sklearn/svm/_base.py (outdated)

```diff
@@ -722,6 +726,7 @@ def __init__(
     C=C,
     nu=nu,
     epsilon=0.0,
+    quantile=0.0,
```
Would it be possible to pass `quantile=None` in models for which the quantile parameter is unused (and actually not pass it in the call to `super().__init__` when left to the default value) in the Python-level API of the classes?

Actually scratch that. I re-read how the base class `BaseLibSVM` is organized and it does not use kwargs in its `__init__`.

It's a bit unfortunate that `BaseLibSVM` is responsible for setting the public attributes. It would be necessary to refactor it to avoid setting the parameters that are unused in the concrete classes (e.g. `SVC.quantile`, `SVC.nu` and so on). However, I have the feeling that this refactoring should be done in another PR.
Now that you point it out, I'm not sure why I put this into `BaseSVC`, since that should be the base class for classifiers, and QSVR is not a classifier. I'll check if I can get rid of it in this place. Other than that, the module is just built in this somewhat odd way in which the base classes anticipate the arguments of all possible sub-classes: `epsilon=0.0` is also passed every time, even though not every SVM uses it.
This is probably because the libsvm C++ wrapper functions (`libsvm.fit` / `libsvm_sparse.libsvm_sparse_train`) expect a fixed list of mandatory arguments, and only the base class calls the wrapper internal API.

We could change it to pass `getattr(self, "quantile", 0.0)` instead of `self.quantile` and only set the `self.quantile` attribute in the `__init__` of the concrete `QuantileSVR` subclass. This way the attribute would not be set on unrelated classes such as `SVC`. A minimal sketch of this pattern follows this comment.

We could fix other sub-class-specific attributes (e.g. `self.nu`) in a similar way, but I would rather keep this PR focused on quantile regression.
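A minimal sketch of the suggested pattern (the class bodies are heavily simplified illustrations, not the actual scikit-learn code):

```python
class BaseLibSVM:
    def fit(self, X, y):
        # Fall back to a neutral default when the concrete subclass does not
        # define the attribute, so e.g. SVC never carries a `quantile`.
        quantile = getattr(self, "quantile", 0.0)
        # ... `quantile` would be passed to the libsvm wrapper here, together
        # with the other mandatory arguments ...
        return self


class QuantileSVR(BaseLibSVM):
    def __init__(self, quantile=0.5):
        # Only the concrete quantile estimator sets the public attribute.
        self.quantile = quantile
```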
Alright, I did what you suggested: I used `getattr` with a default value when calling the libsvm function, and removed the `quantile` attribute from the base classes.
* test that the quantiles are properly calibrated
* test that illegal inputs result in appropriate errors
* test that (linear) QSVR is equivalent to pinball loss minimization
I think this leaves only the improved example on the to-do list, but I won't be able to get around to that today. This is plenty of material to review already. :)
What is implemented?
This PR implements quantile regression using support-vector machines.
Mathematically, this applies the "kernel trick" to an L2-regularized linear regression that minimizes the "pinball loss". For linear kernels and without regularization, this would give the same result as the already existing `QuantileRegressor` (see #9978). The dual problem is derived in Hwang et al. (2005).

Implementation-wise, the algorithm is only a slight modification of epsilon-SVR. In fact, when the quantile is set to 0.5, the regression is exactly equivalent to an epsilon-SVR where epsilon is set to zero! Thanks to this very close similarity, only very few changes are needed w.r.t. epsilon-SVR to make it work. The efficiency is the same as that of epsilon-SVR (for better or worse).
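For reference, the pinball loss for a target quantile $\tau \in (0, 1)$ has the standard textbook form (added here for context, not quoted from the PR):

$$
L_\tau(y, \hat{y}) =
\begin{cases}
\tau\,(y - \hat{y}) & \text{if } y \ge \hat{y},\\
(1 - \tau)\,(\hat{y} - y) & \text{otherwise,}
\end{cases}
$$

so $\tau = 0.5$ recovers half the absolute error, consistent with the equivalence to epsilon-SVR with epsilon set to zero noted above.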
Why is this useful?
Scikit-learn already contains a quantile regressor, but it is restricted to solving linear problems. Although this restriction can be partially alleviated by transforming the inputs into polynomial features or B-splines, it would be far more useful, especially when dealing with more than one dimension, if one could apply the kernel trick. In addition, L2 regularization is probably more desirable than L1 regularization for most regression problems.
The `QuantileSVR` regressor can be used to estimate prediction intervals for non-linear functions, as shown in the figure below (see example code).

References
Hwang, C., Shim, J. (2005). A Simple Quantile Regression via Support Vector Machine. In: Wang, L., Chen, K., Ong, Y.S. (eds) Advances in Natural Computation. ICNC 2005. Lecture Notes in Computer Science, vol 3610. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11539087_66