
[MRG] FIX max_leaf_node and max_depth interaction in GBDT #16183


Merged
merged 11 commits on Feb 7, 2020

Conversation

@NicolasHug NicolasHug commented Jan 23, 2020

Fixes #16179

The fix is simply to check for max_leaf_nodes before checking for max_depth.

This makes sure that _finalize_splittable_nodes is called, which was the cause of the bug.
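To see why the ordering matters, here is a self-contained toy sketch of best-first growth (simplified, with made-up names; not the actual grower code). With the depth check first, the iteration on which the leaf count reaches the cap can be swallowed by that branch, so the equality test against max_leaf_nodes never fires and the tree overshoots the cap:

```python
import heapq

def grow(max_depth, max_leaf_nodes, leaf_check_first):
    """Toy best-first grower: every node is splittable until a limit hits."""
    splittable = [(-1.0, 0)]  # (negative gain, depth); root at depth 0
    n_leaves = 1              # the root starts as the only leaf

    while splittable:
        _, depth = heapq.heappop(splittable)
        n_leaves += 1         # one leaf becomes two leaves: net +1

        if leaf_check_first and n_leaves == max_leaf_nodes:
            # fixed order: reaching the cap also finalizes every remaining
            # splittable node (the _finalize_splittable_nodes step), then stops
            return n_leaves

        if depth + 1 == max_depth:
            # children sit at max depth: finalize them and keep growing the
            # rest of the frontier -- note that this skips the leaf check
            continue

        if not leaf_check_first and n_leaves == max_leaf_nodes:
            # buggy order: if the cap is reached on an iteration where the
            # depth branch fired, this equality is never seen again
            return n_leaves

        heapq.heappush(splittable, (-1.0, depth + 1))
        heapq.heappush(splittable, (-1.0, depth + 1))

    return n_leaves

print(grow(max_depth=4, max_leaf_nodes=9, leaf_check_first=True))   # 9: cap respected
print(grow(max_depth=4, max_leaf_nodes=9, leaf_check_first=False))  # 16: cap overshot
```

With max_depth=4 and max_leaf_nodes=9 (the same values as the test discussed in this PR), the fixed ordering stops at exactly 9 leaves, while the old ordering grows the full depth-4 tree with 16 leaves.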

@NicolasHug NicolasHug changed the title [WIP] FIX max_leaf_node and max_depth interaction in GBDT [MRG] FIX max_leaf_node and max_depth interaction in GBDT Jan 23, 2020
@@ -280,6 +283,18 @@ def test_max_depth(max_depth):
     assert depth == max_depth


+def test_max_depth_max_leaf_nodes():
Member

Don't you think we should move this test into test_gradient_boosting.py instead of test_grower.py?

Member Author (@NicolasHug)

No strong opinion. It's really a test of the grower code more than the estimator, but yeah, it's not obvious how to come up with something that would only involve the grower.

Member

Okay this may seem a little crazy, but:

import numpy as np
from sklearn.datasets import make_classification
# private scikit-learn imports (module paths approximate, as of ~0.22)
from sklearn.ensemble._hist_gradient_boosting.binning import _BinMapper
from sklearn.ensemble._hist_gradient_boosting.common import Y_DTYPE
from sklearn.ensemble._hist_gradient_boosting.grower import TreeGrower
from sklearn.ensemble._hist_gradient_boosting.loss import LeastSquares
from sklearn.ensemble._hist_gradient_boosting.gradient_boosting import (
    _update_raw_predictions)


def test_max_depth_max_leaf_nodes():
    n_samples = 1000
    n_bins = 64
    max_depth = 4
    max_leaf_nodes = 9  # 2**(max_depth - 1) + 1

    X, y = make_classification(n_samples=n_samples, random_state=42)
    X_binned = _BinMapper(n_bins=n_bins, random_state=42).fit_transform(X)
    y = y.astype(Y_DTYPE)

    loss = LeastSquares()
    baseline = loss.get_baseline_prediction(y, 1)
    raw_predictions = np.full(shape=(1, n_samples), fill_value=baseline)

    gradients, hessians = loss.init_gradients_and_hessians(
        n_samples=n_samples, prediction_dim=1)

    for i in range(5):
        loss.update_gradients_and_hessians(gradients, hessians, y,
                                           raw_predictions)
        grower = TreeGrower(X_binned, gradients[0, :], hessians[0, :],
                            n_bins=n_bins, max_leaf_nodes=max_leaf_nodes,
                            max_depth=max_depth)
        grower.grow()
        _update_raw_predictions(raw_predictions[0, :], grower)
        assert len(grower.finalized_leaves) <= max_leaf_nodes

@NicolasHug (Member Author)

You might like this new solution better, @thomasjpfan.

@NicolasHug (Member Author)

The CI failure is unrelated.

@@ -355,16 +355,16 @@ def split_next(self):

         self.n_nodes += 2

-        if self.max_depth is not None and depth == self.max_depth:
+        if (self.max_leaf_nodes is not None
Member Author (@NicolasHug)

The diff might be confusing, but all I did was check for max_leaf_nodes before checking for max_depth.
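That reordering can be sketched as a tiny standalone helper (hypothetical names; the real logic lives inline in TreeGrower.split_next and calls _finalize_leaf / _finalize_splittable_nodes):

```python
def limit_actions(n_leaf_nodes, depth, max_leaf_nodes, max_depth):
    """Which finalization steps fire after a split, with the fixed ordering:
    the leaf-count check comes first, so finalizing the remaining splittable
    nodes is guaranteed to happen when the cap is hit, even on a split that
    also reaches max_depth."""
    if max_leaf_nodes is not None and n_leaf_nodes == max_leaf_nodes:
        return ["finalize_children", "finalize_splittable_nodes"]
    if max_depth is not None and depth == max_depth:
        return ["finalize_children"]
    return []

# both limits hit on the same split: the leaf-limit branch wins
print(limit_actions(9, 4, max_leaf_nodes=9, max_depth=4))
# ['finalize_children', 'finalize_splittable_nodes']
```

Under the old ordering the depth branch would win that tie, finalizing only the two children and leaving the rest of the splittable frontier free to keep growing past the leaf cap.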

Member

This reordering was noticed ;)

@NicolasHug (Member Author)

Thanks for the reviews so far. I have moved the test and simplified it to a 3-liner. I can confirm it's a non-regression test that fails on master.

@thomasjpfan thomasjpfan left a comment

LGTM

Can confirm that test fails on master.


@adrinjalali adrinjalali left a comment

love the final fix lol.

@adrinjalali adrinjalali merged commit 98b3c7c into scikit-learn:master Feb 7, 2020
thomasjpfan pushed a commit to thomasjpfan/scikit-learn that referenced this pull request Feb 22, 2020
* fix max_leaf_node max_depth interaction

* Added test

* comment

* what's new

* simpler solution

* moved and simplified test

* typo
panpiort8 pushed a commit to panpiort8/scikit-learn that referenced this pull request Mar 3, 2020
Successfully merging this pull request may close these issues.

max_leaf_nodes in HistGradientBoostingRegressor changes automatically based on values in max depth during training
4 participants