
FEA add quantile HGBT #21800


Merged: 8 commits, Feb 22, 2022

Conversation

@lorentzenchr (Member) commented Nov 26, 2021

Reference Issues/PRs

Follow-up of #20567 and #20811. Solves #17955.

It is based on top of #20811, so it should be merged after that PR. Edit: merged.

What does this implement/fix? Explain your changes.

This PR adds loss="quantile" to HistGradientBoostingRegressor.

Any other comments?

This also introduces the new parameter loss_param for specifying which quantile to model, in anticipation of other possible losses with one parameter, such as the Tweedie deviance.
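
For reference, a minimal usage sketch (editor's illustration, not part of the PR diff). The review discussion below renamed loss_param to quantile before merging, so the sketch uses the merged spelling; the synthetic data and the 0.9 level are arbitrary.

    # Fit the 90% conditional quantile instead of the conditional mean.
    import numpy as np
    from sklearn.ensemble import HistGradientBoostingRegressor

    rng = np.random.RandomState(0)
    X = rng.uniform(size=(1000, 2))
    # Heteroscedastic noise so that quantiles differ from a shifted mean.
    y = X[:, 0] + rng.normal(scale=0.1 + X[:, 1], size=1000)

    model = HistGradientBoostingRegressor(loss="quantile", quantile=0.9)
    model.fit(X, y)
    # Empirical coverage on the training data should be roughly 0.9
    # (up to over- or underfitting).
    print(np.mean(y <= model.predict(X)))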

@thomasjpfan (Member) commented Nov 26, 2021

Should we do the loss migration first and then add quantile loss in a follow up? I do not want an API discussion around loss_param or tests for quantile loss to block the loss migration.

I suspect we would want to benchmark the new loss implementations and compare it to main in this PR.

Edit: I just saw that #20811 does the loss migrations. I guess we want to merge that one first and then work on this one?

@lorentzenchr linked an issue Dec 10, 2021 that may be closed by this pull request
@lorentzenchr added the High Priority label Dec 10, 2021
@lorentzenchr (Member, Author)

Marking as high priority according to https://scikit-learn.fondation-inria.fr/technical-committee-june-2-2021-fr/.

@lorentzenchr (Member, Author)

After merging #20811 and a git rebase, this is an enjoyably short PR for a new feature!

@thomasjpfan (Member) left a comment

Thanks for the update!

Comment on lines 1148 to 1150:

    loss_param : float, default=None
        If loss is "quantile", this parameter specifies which quantile is to be
        estimated and must be between 0 and 1.
Member:

I see a few paths forward for passing in more parameters for a loss (sketched below):

  1. If we want to be fully generic, then loss_param could become a dictionary, which would be passed down into the loss directly. This way, other parameterized losses can be supported generically by this dictionary. We kind of already do this with metric_params.

  2. We can copy GradientBoostingRegressor's API and use a single parameter (alpha) that is just for the quantile loss. In the HistGradientBoosting* case, I would call it quantile instead of alpha.

  3. (Likely out of reach in the near term) Losses become public and users can create a PinballLoss(quantile=0.3) and pass it directly into HistGradient*.
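
To make these concrete, hypothetical constructor calls for the three options could look as follows (editor's sketch; none of these signatures existed at this point, and only option 2 matches what was eventually merged):

    from sklearn.ensemble import HistGradientBoostingRegressor

    # 1. Generic dictionary forwarded to the loss (hypothetical signature):
    # HistGradientBoostingRegressor(loss="quantile", loss_param={"quantile": 0.3})

    # 2. Dedicated parameter, mirroring GradientBoostingRegressor's alpha
    #    (this is the form that was eventually merged):
    HistGradientBoostingRegressor(loss="quantile", quantile=0.3)

    # 3. Public loss object passed in directly (hypothetical signature):
    # HistGradientBoostingRegressor(loss=PinballLoss(quantile=0.3))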

@lorentzenchr (Member, Author)

I'd like to add the option (sketched at the end of this comment):

  4. loss_param is a single parameter with a different meaning depending on the loss, i.e. a quantile level for loss="quantile" and the Tweedie power for loss="tweedie" (in a later PR).

Comments:

  1. loss_param dictionary:
     This makes it clear which loss is used with which parameter. As the loss parameters should not be used for hyperparameter tuning, this is OK.
  2. A special parameter per loss:
     If we add further losses, this would clutter the API.
  3. Passing a loss object instance:
     Not doing this, and instead using simple parameter types like strings and floats, was an original design decision of scikit-learn. It would, however, sometimes make life easier (meaning I'm not opposed to this direction).

General: I really like consistency and would favor applying/adapting the choice we make here to other estimators, e.g. GradientBoostingRegressor and maybe one day DecisionTreeRegressor as well.
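
As a hypothetical sketch of option 4 above (editor's illustration; loss="tweedie" and these exact signatures do not exist, and the values are arbitrary):

    # One generic loss_param whose meaning depends on the chosen loss (hypothetical):
    # HistGradientBoostingRegressor(loss="quantile", loss_param=0.9)  # quantile level
    # HistGradientBoostingRegressor(loss="tweedie", loss_param=1.5)   # Tweedie power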

@thomasjpfan (Member) commented Jan 21, 2022

loss_param is a single parameter with a different meaning depending on the loss, i.e. a quantile level for loss="quantile" and the Tweedie power for loss="tweedie" (in a later PR).

I was thinking of losses with more than one parameter. Although I think supporting multiple parameters can be considered a non-issue: advanced users can create their own loss object and pass it in, kind of like how we unofficially support passing a Criterion object into the trees:

    else:
        # Make a deepcopy in case the criterion has mutable attributes that
        # might be shared and modified concurrently during parallel fitting
        criterion = copy.deepcopy(criterion)

As a starting point, I am okay with a single loss_param. I think the API discussion is worth bringing up during the next dev meeting.

Member:

I'm not sure we already have a design similar to the proposed loss_param in scikit-learn, do we?

Do we already have a clear idea of what future losses we want to support and what parameters they would need, apart from Tweedie? If we don't, using loss_param instead of the more consistent and simpler quantile parameter might be solving a problem we don't really have.

Member:

One reason to go towards making loss_param a dictionary is to support losses with more than one parameter. Although traditional losses, e.g. the epsilon-insensitive loss or the Huber loss, also have only one parameter, maybe a single parameter would become a restriction in the future?

Member:

I'm not sure we already have a design similar to the proposed loss_param in scikit-learn, do we?

Similar to this is param in GenericUnivariateSelect but it's always seemed quite ugly to me!

Member:

From the dev meeting, I think the most agreeable and quickest path forward is to change loss_param to quantile and only accept a single float.

@ogrisel self-requested a review January 25, 2022 15:22
@ogrisel (Member) left a comment

LGTM.

Please consider including lorentzenchr#2 if you like it.

Let's wait for the decision at the dev meeting w.r.t. loss_param. For the record I am fine with the way it is unless we already know of a loss that we plan to add and that would naturally require more than 1 parameter, in which case we can go for the dict option directly.

Note that we already have neighbors models that accept constructor params defined as metric="minkowski", metric_params={"p": 1.5, "w": [0.2, 0.8]}, for instance.

@ogrisel (Member) commented Jan 26, 2022

@lorentzenchr, brainstorming for a potential follow-up PR: do you think it would be possible to have a quantile loss that could predict multiple quantiles at once (similarly to the multinomial loss, which can predict one output per possible class)?

That would still require n times more trees in the ensemble when predicting n quantiles instead of 1 (as for the multinomial loss), but it could be helpful from a usability standpoint.

In a follow-up to the follow-up PR, we could even extend the capabilities of the individual HGBR trees so as to predict n outputs per tree directly and have the split select the highest improvement in the aggregate loss. This would have a different inductive bias but could be computationally much more efficient, both for multi-quantile regression and for multi-class classification (and we could also support multi-label in a similar way).

@lorentzenchr (Member, Author)

do you think it would be possible to have a quantile loss that could predict multiple quantiles at once?

@ogrisel For the (pinball) loss itself, there is no gain (that I see) in enabling several quantile levels at the same time, other than user convenience: one pinball loss is tailored to exactly one quantile level.
For quantiles, it happens that the losses for, say, 10 quantile levels can be combined into a single loss as the sum of the 10 corresponding pinball losses. But again, I see no advantage (apart from an additional metric for 2 symmetric quantile levels, alpha/2 and 1 - alpha/2, for user convenience).
Combining several quantile levels seems more promising for linear models, where one could forbid quantile crossings (via inequality constraints in the linprog solver).
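
As a small editor's illustration of this point (using scikit-learn's mean_pinball_loss metric; the numbers are made up): the combined loss over several quantile levels is just the sum of the per-level pinball losses, so nothing new happens at the loss level.

    import numpy as np
    from sklearn.metrics import mean_pinball_loss

    y_true = np.array([1.0, 2.0, 3.0, 4.0])
    # One prediction vector per quantile level, e.g. from separately fitted models.
    preds = {
        0.1: np.array([0.5, 1.5, 2.5, 3.5]),
        0.5: np.array([1.0, 2.1, 2.9, 4.0]),
        0.9: np.array([1.6, 2.7, 3.5, 4.8]),
    }

    per_level = {a: mean_pinball_loss(y_true, p, alpha=a) for a, p in preds.items()}
    combined = sum(per_level.values())  # the "multi-quantile" loss is just this sum
    print(per_level, combined)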

In a follow-up to the follow-up PR, we could even extend the capabilities of the individual HGBR trees so as to predict n outputs per tree directly and have the split select the highest improvement in the aggregate loss.

My first thought is that a single tree/split for all quantiles could be too simplistic for the whole predictive distribution. The problem is the weighting: should we weight the median more than the 99% quantile, or the other way around? It works, however, for quantile regression forests, see Meinshausen 2006. I have to think about it more deeply.

@agramfort (Member) commented Feb 1, 2022 via email

@lorentzenchr (Member, Author)

@ogrisel Do you stand by your approval?

@lorentzenchr (Member, Author)

@agramfort Thanks for your review. It is good to know that there are other people interested in quantile regression 😄

@agramfort (Member) commented Feb 1, 2022 via email

@thomasjpfan (Member) left a comment

Thanks for the update!

@lorentzenchr (Member, Author)

Status report: 2 approvals, all lights green and no comments left.

@thomasjpfan (Member) left a comment

LGTM!

@ogrisel Your approval was given when we still had loss_param. Are you okay with using quantile instead?

Labels: High Priority, module:ensemble, Waiting for Reviewer
Successfully merging this pull request may close: HistGradientBoostingRegressor and quantile loss function
6 participants