[DOC] Speed up plot_gradient_boosting_quantile.py
example
#21666
Conversation
@lorentzenchr it'd be nice if we could reduce the time of this example much further, but I'm kinda out of ideas.
@adrinjalali I don't know if you want to hear this one: merge #20567 and follow-up PRs, then use Edit: Meanwhile, #20567 is merged and the follow-up for quantile HGBT is #21800.
The changes to gain speed are good. I would, however, put back the information on different optimal tree depth for the different quantiles.
# We observe that the search procedure identifies that deeper trees are needed
# to get a good fit for the 5th percentile regressor. Deeper trees are more
# expressive and less likely to underfit.
# We observe that the hyper-parameters that were hand-tuned for the median
# regressor are in the same range as the hyper-parameters suitable for the 5th
# percentile regressor.
This is not the same message anymore.
# This time, shallower trees are selected and lead to a more constant piecewise
# and therefore more robust estimation of the 95th percentile. This is
# beneficial as it avoids overfitting the large outliers of the log-normal
# additive noise.
#
# We can confirm this intuition by displaying the predicted 90% confidence
# interval comprised by the predictions of those two tuned quantile regressors:
# the prediction of the upper 95th percentile has a much coarser shape than the
# prediction of the lower 5th percentile:
# The result shows that the hyper-parameters for the 95th percentile regressor
# identified by the grid search are roughly in the same range as the hand-
# tuned hyper-parameters for the median regressor and the hyper-parameters
# identified by the grid search for the 5th percentile regressor. However, the
# hyper-parameter grid searches did lead to an improved 90% confidence
# interval which can be seen below:
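The approach the comment describes — one regressor per quantile, whose paired predictions form the confidence band — can be sketched roughly as follows. This is a minimal illustration, not the example's actual code: the synthetic data and hyper-parameter values are made up for demonstration.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Toy data with asymmetric (log-normal) noise, loosely mimicking the example.
rng = np.random.RandomState(42)
X = rng.uniform(0, 10, size=(200, 1))
y = np.sin(X.ravel()) + rng.lognormal(sigma=0.5, size=200)

# One regressor per quantile; loss="quantile" with alpha selects the target
# percentile (0.05 -> 5th, 0.95 -> 95th). Depths here are illustrative only.
models = {
    alpha: GradientBoostingRegressor(
        loss="quantile", alpha=alpha, n_estimators=100, max_depth=3,
        random_state=0,
    ).fit(X, y)
    for alpha in (0.05, 0.95)
}

# The two predictions together form the 90% confidence interval.
X_test = np.linspace(0, 10, 50).reshape(-1, 1)
lower = models[0.05].predict(X_test)
upper = models[0.95].predict(X_test)
```

Plotting `lower` and `upper` against `X_test` reproduces the kind of band discussed above, where the two bounds can end up with visibly different granularity depending on the tuned tree depth.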
This change also loses some, in my opinion, valuable information.
I think the issue here is that the original hyperparameter space didn't include a 0.2 learning rate, and therefore the statement was true; expanding the learning-rate space makes the example faster and also makes it actually choose shallow trees in the first place. So I'm not sure the information provided was really true in the first place.
Or do you mean something else @lorentzenchr ?
In the last plot, it's visible that the lower (5%) quantile is much more fine grained than the upper (95%) quantile. This statement, however, is less obvious from the new tuned parameters. I would at least comment on the plot.
Regarding the tree depth: yes, based on my results deeper trees are actually not needed to achieve a good fit for the 5th percentile regressor. This is why I changed the documentation.
Regarding the granularity of the 5th and 95th percentile: true, with my changes the documentation on this gets lost. I'll change this.
I'll also try using HalvingRandomSearchCV as suggested and will report back my results.
@marenwestermann Do you need any help or just more time?
@lorentzenchr Thank you for checking in! I was moving house in the meantime. :) I addressed the comments, let me know if you would like more changes.
…nn/scikit-learn into gradient-boosting-quantile
I was able to reduce the runtime to 23 seconds on my computer (from initially 78 seconds) using HalvingRandomSearchCV.
thanks @marenwestermann
That's great.
Unrelated, merging latest
LGTM
@marenwestermann Thank you for this work. Could you merge main and if CI is green I'll be happy to merge.
Great work and nice improvement!
…earn#21666) Co-authored-by: Maren Westermann <maren.westermann@free-now.com> Co-authored-by: Tom Dupré la Tour <tom.dupre-la-tour@m4x.org>
Reference Issues/PRs
Addresses #21598
What does this implement/fix? Explain your changes.
Speeds up
../examples/ensemble/plot_gradient_boosting_quantile.py
. On my computer the runtime improved from 78 seconds to 54 seconds. The section that makes this example slow is the grid search in the section "Tuning the hyper-parameters of the quantile regressors". I sped up this section by changing and removing parameters in the grid search, and adjusted the documentation accordingly because the results now differ from the previous ones. This example could be sped up a lot more by removing the grid search completely, but I don't know if that is desirable.

Any other comments?