ENH replace loss module Gradient boosting #26278

Merged

61 commits
c3ffef9
ENH common losses in gradient boosting
lorentzenchr Sep 17, 2022
ffc0cdc
MNT replace _validate_y by _encode_y
lorentzenchr Mar 19, 2023
b250fd4
MNT rename _is_initialized to _is_fitted
lorentzenchr Mar 19, 2023
ba8f2a7
ENH use common loss function module in GB
lorentzenchr Mar 24, 2023
7d0145c
MNT fix backwards compatibility for oob_scores_ and oob_improvement_
lorentzenchr Apr 23, 2023
a8c73eb
FIX test_non_uniform_weights_toy_edge_case_reg
lorentzenchr Apr 23, 2023
2be5987
FIX _init_raw_predictions detection for predict_proba
lorentzenchr Apr 23, 2023
1495084
FIX huber GBT
lorentzenchr Apr 24, 2023
55f2448
FIX factor of losses for backward compat
lorentzenchr Apr 24, 2023
9f6cb2f
TST add tests for backward compat
lorentzenchr Apr 24, 2023
bfb76c9
FIX factor multinomial loss
lorentzenchr Apr 24, 2023
64098db
TST add test_binomial_vs_alternative_formulation
lorentzenchr Apr 24, 2023
966aa06
MNT remove _gb_losses.py and tests
lorentzenchr Apr 24, 2023
b6f4853
DOC add whatsnew entry
lorentzenchr Apr 24, 2023
b6191b6
Merge branch 'main' into gradient_boosting_common_loss
lorentzenchr Apr 24, 2023
5b5fc67
FIX don't use sample_weight in gradient
lorentzenchr Apr 26, 2023
d5ece9a
CLN remove comment left overs
lorentzenchr Apr 26, 2023
2521d2f
CLN add todo note on use of loss_gradient
lorentzenchr Apr 26, 2023
18933f7
Revert "TST add test_binomial_vs_alternative_formulation"
lorentzenchr Apr 29, 2023
30f91c7
TST reduce tol
lorentzenchr Apr 30, 2023
c80dc7f
FIX first check sample_weight then encode y
lorentzenchr Apr 30, 2023
72f7fb8
TST reduce tol
lorentzenchr May 1, 2023
b4c8587
CLN minor code cleanups in _update_terminal_regions
lorentzenchr May 1, 2023
c6cefe4
DOC cross_val_score in common pitfalls
lorentzenchr May 1, 2023
dabcba2
FIX use percentile from fixes
lorentzenchr May 1, 2023
c3656b8
TST adapt test_non_uniform_weights_toy_edge_case_reg
lorentzenchr May 1, 2023
36baff6
MNT do not unnecessarily copy raw_predictions
lorentzenchr May 1, 2023
53a3542
MNT remove X_csr and X_csc
lorentzenchr May 1, 2023
345d04f
CLN better comment
lorentzenchr May 1, 2023
cb9ebba
DOC list n_trees_per_iteration_ under Attributes
lorentzenchr May 1, 2023
fab68c7
TST skip_if_32bit test_huber_exact_backward_compat
lorentzenchr May 1, 2023
9461d1c
Merge branch 'main' into gradient_boosting_common_loss
lorentzenchr Jun 2, 2023
0c1c552
CLN expand comment
lorentzenchr Aug 1, 2023
8423e86
Merge branch 'main' into gradient_boosting_common_loss
lorentzenchr Aug 1, 2023
729a6d2
DOC mvoe whatsnew to 1.4
lorentzenchr Aug 1, 2023
9849990
CLN increase versionadded to 1.4
lorentzenchr Aug 1, 2023
bcdaf40
CLN make codecov happy
lorentzenchr Aug 1, 2023
40315f5
ENH add BaseLoss to param constraint and add tests
lorentzenchr Aug 1, 2023
3c4ad59
Merge branch 'main' into gradient_boosting_common_loss
lorentzenchr Aug 1, 2023
2f0a293
ENH _save_divide
lorentzenchr Aug 2, 2023
5c4dd9f
CLN neg_gradients outside of main loop
lorentzenchr Aug 2, 2023
2a9cdb8
CLN _encode_y and validation_loss review comments
lorentzenchr Aug 5, 2023
f486952
ENH handle gradient.ndim=2 as in HGBT
lorentzenchr Aug 5, 2023
de22dc6
CLN add TODO where to put in the learning rate
lorentzenchr Aug 5, 2023
16d212f
CLN correct docstring of _update_terminal_regions
lorentzenchr Aug 5, 2023
16d7e3a
CLN rename residual to neg_gradient
lorentzenchr Aug 5, 2023
983a53c
CLN tree value update
lorentzenchr Aug 5, 2023
f1f1ccb
ENH add compute_update to pull if-else of losses out of leaf loop
lorentzenchr Aug 5, 2023
ac2e852
TST add test_raise_if_init_has_no_predict_proba
lorentzenchr Aug 11, 2023
78176eb
Merge branch 'main' into gradient_boosting_common_loss
lorentzenchr Aug 11, 2023
bd0d548
ENH remove unweighted set_huber_delta
lorentzenchr Aug 12, 2023
342aeb9
CLN use check_is_fitted and remove NotFittedError
lorentzenchr Aug 12, 2023
e1a9435
CLN remove test_get_loss and do not allow BaseLoss as argument
lorentzenchr Aug 16, 2023
bcd3a04
CLN use encoded_y_int
lorentzenchr Aug 16, 2023
7d87787
CLN address review comments
lorentzenchr Aug 18, 2023
d0e9f7d
TST test_safe_divide
lorentzenchr Aug 19, 2023
7f49f3d
Merge branch 'main' into gradient_boosting_common_loss
lorentzenchr Aug 19, 2023
6e1fae2
MNT remove test_probability_exponential
lorentzenchr Aug 19, 2023
63087d9
ENH improve compute_update for ExponentialLoss
lorentzenchr Sep 6, 2023
6ef5c4d
DOC SquaredError specialties in update terminal regions
lorentzenchr Sep 6, 2023
8c01436
CLN address review comments
lorentzenchr Sep 6, 2023
2 changes: 1 addition & 1 deletion doc/common_pitfalls.rst
@@ -211,7 +211,7 @@ method is used during fitting and predicting::

     >>> from sklearn.model_selection import cross_val_score
     >>> scores = cross_val_score(pipeline, X, y)
     >>> print(f"Mean accuracy: {scores.mean():.2f}+/-{scores.std():.2f}")
-    Mean accuracy: 0.45+/-0.07
+    Mean accuracy: 0.46+/-0.07

 How to avoid data leakage
 -------------------------
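The doctest updated above comes from the common pitfalls guide, where preprocessing is kept inside the pipeline so each CV split stays leak-free. A self-contained sketch of the same pattern (the synthetic dataset and the exact pipeline steps here are illustrative assumptions, not the guide's setup):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the guide's data.
X, y = make_classification(n_samples=200, random_state=0)

# Fitting the scaler inside the pipeline means it is refit on each
# training fold only, which is what keeps the CV estimate leak-free.
pipeline = make_pipeline(
    StandardScaler(),
    GradientBoostingClassifier(n_estimators=50, random_state=0),
)
scores = cross_val_score(pipeline, X, y, cv=5)
print(f"Mean accuracy: {scores.mean():.2f}+/-{scores.std():.2f}")
```

The exact mean depends on the estimator internals, which is why this PR's loss-module change nudged the doctest value from 0.45 to 0.46.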
5 changes: 5 additions & 0 deletions doc/whats_new/v1.4.rst
@@ -132,6 +132,11 @@ Changelog
 :pr:`13649` by :user:`Samuel Ronsin <samronsin>`,
 initiated by :user:`Patrick O'Reilly <pat-oreilly>`.

+- |Efficiency| :class:`ensemble.GradientBoostingClassifier` is faster for
+  binary and in particular for multiclass problems, thanks to the private
+  loss function module.
+  :pr:`26278` by :user:`Christian Lorentzen <lorentzenchr>`.
+
 - |Efficiency| Improves runtime and memory usage for
   :class:`ensemble.GradientBoostingClassifier` and
   :class:`ensemble.GradientBoostingRegressor` when trained on sparse data.
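The changelog entry above highlights the multiclass case, where gradient boosting fits one tree per class per boosting iteration and the shared loss module now computes the per-class gradients. A minimal multiclass fit to illustrate the shape of the ensemble (the dataset and hyperparameters are illustrative assumptions):

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import GradientBoostingClassifier

# Iris has 3 classes, so each of the 50 boosting iterations
# fits 3 regression trees, one per class.
X, y = load_iris(return_X_y=True)
clf = GradientBoostingClassifier(n_estimators=50, random_state=0).fit(X, y)

print(clf.estimators_.shape)  # (n_estimators, n_classes)
print(clf.score(X, y))
```

It is exactly this per-class gradient computation that benefits from the vectorized private loss module replacing the old `_gb_losses.py`.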