
[FEAT] Implement quantile SVR #23153


Open · wants to merge 21 commits into main

Conversation


@atrettin commented Apr 18, 2022

What is implemented?

This PR implements quantile regression using support-vector machines.

Mathematically, this applies the "kernel trick" to an L2-regularized linear regression that minimizes the pinball loss. For linear kernels and without regularization, this would give the same result as the existing QuantileRegressor (see #9978). The dual problem is derived in Hwang and Shim (2005).
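For concreteness, the pinball loss for a target quantile q weights over- and under-predictions asymmetrically. A minimal sketch (the `pinball_loss` helper is illustrative; `mean_pinball_loss` is the existing scikit-learn metric):

```python
import numpy as np
from sklearn.metrics import mean_pinball_loss

def pinball_loss(y_true, y_pred, q):
    # q * (y - y_hat) for under-predictions, (q - 1) * (y - y_hat) otherwise;
    # its expectation is minimized by the q-th conditional quantile of y.
    diff = y_true - y_pred
    return np.mean(np.maximum(q * diff, (q - 1) * diff))

y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.5, 1.5, 1.5])
assert np.isclose(
    pinball_loss(y_true, y_pred, q=0.9),
    mean_pinball_loss(y_true, y_pred, alpha=0.9),
)
```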

Implementation-wise, the algorithm is only a slight modification of epsilon-SVR. In fact, when the quantile is set to 0.5, the regression is exactly equivalent to an epsilon-SVR with epsilon set to zero! Thanks to this close similarity, only a few changes relative to epsilon-SVR are needed to make it work, and the efficiency is the same as that of epsilon-SVR (for better or worse).
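A quick sanity check of this equivalence could look as follows (a sketch only: `QuantileSVR` and its `quantile` parameter are the API proposed in this PR):

```python
import numpy as np
from sklearn.svm import SVR, QuantileSVR  # QuantileSVR: proposed in this PR

rng = np.random.RandomState(0)
X = rng.uniform(-1, 1, size=(200, 1))
y = np.sin(3 * X.ravel()) + 0.1 * rng.standard_normal(200)

# At quantile=0.5 the dual problem coincides with epsilon-SVR at epsilon=0,
# so both models should learn (nearly) the same function.
qsvr = QuantileSVR(quantile=0.5, kernel="rbf", C=1.0).fit(X, y)
svr = SVR(epsilon=0.0, kernel="rbf", C=1.0).fit(X, y)
assert np.allclose(qsvr.predict(X), svr.predict(X), atol=1e-6)
```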

Why is this useful?

Scikit-learn already contains a quantile regressor, but it is restricted to linear problems. Although this restriction can be partially alleviated by transforming the input into polynomial features or B-splines, it would be far more useful, especially in more than one dimension, to be able to apply the kernel trick. In addition, L2 regularization is probably more desirable than L1 regularization for most regression problems.

The QuantileSVR regressor can be used to estimate prediction intervals for non-linear functions, as shown in the example (see example code and the sketch below).
[Figure quantile_svr: prediction intervals estimated with QuantileSVR]
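A minimal sketch of how such intervals could be computed (again assuming the `QuantileSVR` signature proposed in this PR):

```python
import numpy as np
from sklearn.svm import QuantileSVR  # proposed in this PR

rng = np.random.RandomState(42)
X = np.sort(rng.uniform(0, 10, size=(300, 1)), axis=0)
# Heteroscedastic noise: the interval width should grow with X.
y = np.sin(X).ravel() + (0.1 + 0.1 * X.ravel()) * rng.standard_normal(300)

# One model per quantile yields a 90% prediction interval plus a median fit.
lower = QuantileSVR(quantile=0.05, kernel="rbf").fit(X, y)
median = QuantileSVR(quantile=0.50, kernel="rbf").fit(X, y)
upper = QuantileSVR(quantile=0.95, kernel="rbf").fit(X, y)

X_test = np.linspace(0, 10, 100).reshape(-1, 1)
y_low, y_med, y_high = (m.predict(X_test) for m in (lower, median, upper))
```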

References

Hwang, C., Shim, J. (2005). A Simple Quantile Regression via Support Vector Machine. In: Wang, L., Chen, K., Ong, Y.S. (eds) Advances in Natural Computation. ICNC 2005. Lecture Notes in Computer Science, vol 3610. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11539087_66

@atrettin marked this pull request as draft April 18, 2022 10:39
@atrettin changed the title Implement quantile SVR [WIP] [FEAT] Implement quantile SVR Apr 18, 2022
@atrettin (Author)

LGTM analysis fails due to version bump in PR #22674. Should I bump it in lgtm.yml?

@atrettin changed the title [WIP] [FEAT] Implement quantile SVR [FEAT] Implement quantile SVR Apr 19, 2022
@atrettin marked this pull request as ready for review April 19, 2022 11:56
@atrettin (Author)

Rebased onto latest main and fixed unit tests. It looks like the LGTM issue has been fixed in the meantime, yay! But the doc-min-dependencies workflow just timed out for no discernible reason, depriving me of the green check mark. :( Anyway, should anyone be interested in this regressor, I would appreciate feedback!

@ogrisel (Member) left a comment


Thanks for the PR, this is very interesting. Given the fact that this new estimator can be implemented with very little new code, I think it's worth considering inclusion in scikit-learn.

To make the PR easier to review, could you please avoid changing lines of svm.cpp that are unrelated to the topic of the PR (e.g. trailing spaces)?

For the example, it would be interesting to compare several methods:

  • nonlinear feature engineering with SplineTransformer (+Nystroem?) followed by QuantileRegressor
  • quantile SVR with a non-linear kernel
  • tree-based quantile methods (e.g. gradient boosted trees)

and have a concluding paragraph that gives some pros and cons of each method (a rough sketch of such a comparison follows below).

Also, to avoid example proliferation, maybe it would be worth expanding the existing example:

https://scikit-learn.org/dev/auto_examples/ensemble/plot_gradient_boosting_quantile.html#sphx-glr-auto-examples-ensemble-plot-gradient-boosting-quantile-py

by adding new sections at the end.
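A rough sketch of the comparison mentioned above (hyperparameters are purely illustrative; `QuantileSVR` is the estimator proposed in this PR, while the other estimators already exist in scikit-learn):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import QuantileRegressor
from sklearn.metrics import mean_pinball_loss
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import SplineTransformer
from sklearn.svm import QuantileSVR  # proposed in this PR

q = 0.9
models = {
    "splines + QuantileRegressor": make_pipeline(
        SplineTransformer(n_knots=10),
        QuantileRegressor(quantile=q, alpha=1e-3),
    ),
    "QuantileSVR (RBF kernel)": QuantileSVR(quantile=q, kernel="rbf"),
    "gradient boosted trees": GradientBoostingRegressor(loss="quantile", alpha=q),
}

rng = np.random.RandomState(0)
X = rng.uniform(0, 10, size=(500, 1))
y = np.sin(X).ravel() + rng.standard_normal(500)

for name, model in models.items():
    model.fit(X, y)
    print(name, mean_pinball_loss(y, model.predict(X), alpha=q))
```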

About the CI timeout: it's unfortunate and we need to debug it, but it might be transient, so feel free to push a new empty commit to your PR to re-trigger the CI when this happens (and/or merge main so that if/when we fix the problem in main, your PR can benefit from the fix).

/cc @lorentzenchr

assert_allclose(np.linalg.norm(qvr.coef_), np.linalg.norm(svr.coef_), 1, 0.0001)
assert_almost_equal(score1, score2, 2)


@ogrisel (Member) commented Jul 5, 2022


Could you please add tests that check that it converges to the expected results (e.g. by measuring the pinball loss on the training set and checking that it's small) for different values of the quantile parameter, for instance on a synthetic dataset with fixed, repeated X values and a known distribution of Y|X.
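For instance, something along these lines (a sketch assuming the `QuantileSVR` API from this PR):

```python
import numpy as np
import pytest
from scipy.stats import norm
from sklearn.metrics import mean_pinball_loss
from sklearn.svm import QuantileSVR  # proposed in this PR

@pytest.mark.parametrize("quantile", [0.1, 0.5, 0.9])
def test_quantile_svr_training_pinball_loss(quantile):
    # Synthetic data with repeated X values and Gaussian Y|X, so the true
    # conditional quantiles are known in closed form.
    rng = np.random.RandomState(0)
    X = np.repeat(np.arange(10.0), 100).reshape(-1, 1)
    y = X.ravel() + rng.standard_normal(X.shape[0])

    model = QuantileSVR(quantile=quantile, kernel="linear", C=10.0).fit(X, y)
    y_pred = model.predict(X)

    # Empirical coverage on the training set should match the quantile...
    assert np.mean(y <= y_pred) == pytest.approx(quantile, abs=0.05)
    # ...and the training pinball loss should be close to that of an oracle
    # predicting the true conditional quantile.
    oracle = X.ravel() + norm.ppf(quantile)
    assert mean_pinball_loss(y, y_pred, alpha=quantile) <= (
        1.1 * mean_pinball_loss(y, oracle, alpha=quantile)
    )
```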

@atrettin (Author)

Okay, I will get to this soon! Thank you so much for reviewing!

@atrettin (Author)

Alright, I added several additional tests (which I admit to having shamelessly stolen from QuantileRegressor). They test that the quantiles are well calibrated with any kernel, that illegal inputs raise appropriate errors, and, finally, that (linear) QSVR gives approximately the same result as a Nelder-Mead minimization of the pinball loss.
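The Nelder-Mead comparison is roughly in this spirit (a sketch, assuming the `QuantileSVR` API; a large `C` keeps the L2 penalty weak so the two fits are comparable):

```python
import numpy as np
from scipy.optimize import minimize
from sklearn.svm import QuantileSVR  # proposed in this PR

rng = np.random.RandomState(0)
X = rng.standard_normal((100, 2))
y = X @ np.array([1.0, -2.0]) + rng.standard_normal(100)
q = 0.75

def objective(params):
    # Pinball loss of a linear model: params[0] is the intercept.
    residual = y - (params[0] + X @ params[1:])
    return np.mean(np.maximum(q * residual, (q - 1) * residual))

res = minimize(objective, x0=np.zeros(3), method="Nelder-Mead")
qsvr = QuantileSVR(quantile=q, kernel="linear", C=1e4).fit(X, y)

# With weak regularization the two linear fits should roughly agree.
np.testing.assert_allclose(qsvr.coef_.ravel(), res.x[1:], atol=0.1)
```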

@@ -1131,6 +1131,10 @@ Changelog
:mod:`sklearn.svm`
..................

- |Feature| :class:`svm.QuantileSVR` implements quantile regression with support vector
  machines as derived in Hwang, C., Shim, J. (2005), DOI: 10.1007/11539087_66.
  :pr:`23153` by :user:`Alexander Trettin <atrettin>`.
Member

This will need to be moved to v1.2.rst (sorry for the slow feedback...).

@atrettin (Author)

Done!

@atrettin (Author) commented Jul 5, 2022

Thank you for having a look at this, @ogrisel! I will look at the example and perhaps merge it with the example you linked, and add some more unit tests (where I might just shamelessly copy a few things from QuantileRegressor, since it checks some of the same things).

@@ -722,6 +726,7 @@ def __init__(
C=C,
nu=nu,
epsilon=0.0,
quantile=0.0,
@ogrisel (Member) commented Jul 5, 2022


Would it be possible to pass quantile=None in models for which the quantile parameter is unused (and actually not pass it in the call to super().__init__ when left at the default value) in the Python-level API of the classes?

Actually, scratch that. I re-read how the base class BaseLibSVM is organized and it does not use kwargs in its __init__.

It's a bit unfortunate that BaseLibSVM is responsible for setting the public attributes. It would be necessary to refactor it to avoid setting the parameters that are unused in the concrete classes (e.g. SVC.quantile, SVC.nu and so on). However, I have the feeling that this refactoring should be done in another PR.

@atrettin (Author)

Now that you point it out, I'm not sure why I put this into BaseSVC, since that is the base class for classifiers and QSVR is not a classifier. I'll check whether I can get rid of it there. Other than that, the module is just built in this somewhat odd way in which the base classes anticipate the arguments of all possible subclasses; epsilon=0.0 is also passed every time, even though not every SVM uses it.

Member

This is probably because the libsvm C++ wrapper functions (libsvm.fit / libsvm_sparse.libsvm_sparse_train) expect a fixed list of mandatory arguments and only the base class is calling the wrapper internal API.

We could change it to pass getattr(self, "quantile", 0.0) instead of self.quantile and only set the self.quantile attribute in the __init__ of the concrete QuantileSVR subclass. This way this attribute would not be set on unrelated classes such as SVC.

We could fix other subclass-specific attributes (e.g. self.nu) in a similar way, but I would rather keep this PR focused on quantile regression.
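The idea in a nutshell (a sketch of the relevant spot in `BaseLibSVM.fit`, not the actual diff):

```python
# Inside BaseLibSVM.fit: read `quantile` defensively, so that only the
# QuantileSVR subclass has to define the attribute in its __init__.
quantile = getattr(self, "quantile", 0.0)
# ...then pass `quantile` to libsvm.fit / libsvm_sparse.libsvm_sparse_train
# along with the other fixed, mandatory arguments.
```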

@atrettin (Author)

Alright, I did what you suggested and used getattr with a default value when calling the libsvm function, and removed the quantile attribute from the base classes.

@atrettin (Author)

I think this leaves only the better example on the to-do list, but I won't be able to get around to that today. This is plenty of material to review already. :)
