[WIP] Loss Module: replace in logistic regression and hgbt #19089
Conversation
Force-pushed from d924b4f to 4e00738.
Benchmarks of fit time
Hardware: Intel Core i7-8559U (8th generation), 16 GB RAM
HistGradientBoostingClassifier
Thanks for the benchmarks! I only checked the HistGradientBoosting ones. Are these the times spent in `fit`? We probably don't need to worry about …

Also, what's the reason for benchmarking with and without early stopping? Is it so that we get more loss computations, or is it because the loss results might change significantly enough to affect the early-stopping decision?

It might be worth benchmarking with OMP_NUM_THREADS=1 as well, to get an idea of the intrinsic benefits of the new implementation without multi-threading.
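As an illustration of that suggestion, a minimal sketch of what a single-threaded fit-time benchmark could look like; the dataset, sizes, and use of `make_classification` are assumptions for illustration, not the script behind the numbers reported above:

```python
# Hypothetical benchmark sketch (not the script used for the numbers above).
# Run with OMP_NUM_THREADS=1 to measure single-threaded fit time, e.g.:
#     OMP_NUM_THREADS=1 python bench_hgbt.py
from time import perf_counter

from sklearn.datasets import make_classification
from sklearn.ensemble import HistGradientBoostingClassifier

X, y = make_classification(n_samples=100_000, n_features=20, random_state=0)

for early_stopping in (False, True):
    clf = HistGradientBoostingClassifier(
        early_stopping=early_stopping, random_state=0
    )
    tic = perf_counter()
    clf.fit(X, y)
    toc = perf_counter()
    print(f"early_stopping={early_stopping}: fit in {toc - tic:.2f} s")
```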
Nice benchmarks! Overall improvements, and the slowdowns are minor.
@NicolasHug I updated the benchmarks to include a single-threaded case for HGBT. All times are fit times, as now stated in the header.
It might make sense to only have logistic regression in this PR and have HGBT separately in #20811? |
@rth I'll open a new PR for …
Reference Issues/PRs
Solves #15123. Follow-up on #19088.
What does this implement/fix? Explain your changes.
The losses in `sklearn/ensemble/_hist_gradient_boosting` and in `sklearn.linear_model._logistic` are replaced with the new losses from #19088.
Any other comments?
Needs benchmarking. First rudimentary results indicate no regression.
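For illustration only, a minimal sketch of how the common loss objects introduced in #19088 can be used. This is based on the private `sklearn._loss` module as it later shipped in scikit-learn, so class and method names may differ from the state of this PR:

```python
import numpy as np

# Private module from #19088; not part of the public API.
from sklearn._loss.loss import HalfBinomialLoss

loss = HalfBinomialLoss()

# raw_prediction lives in link space (log-odds for the binomial loss).
y_true = np.array([0.0, 1.0, 1.0])
raw_prediction = np.array([-1.0, 0.5, 2.0])

# Compute per-sample loss values and gradients in a single pass.
loss_values, gradient = loss.loss_gradient(
    y_true=y_true, raw_prediction=raw_prediction
)
print(loss_values.mean(), gradient)
```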