[MRG] Take sample weights into account in partial dependence computation #13193

Conversation

samronsin
Contributor

Reference Issues/PRs

Fixes #13192.

What does this implement/fix? Explain your changes.

This PR makes `partial_dependence` take sample weights into account by replacing `n_node_samples` with `weighted_n_node_samples` in the partial dependence computation.
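To illustrate the idea, here is a minimal sketch (a hypothetical helper, not the actual scikit-learn implementation): when recursing through a fitted tree, splits that are not on the target feature average the two children, and the fix is to weight that average by `weighted_n_node_samples` rather than the raw `n_node_samples` counts.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def partial_dependence_tree(est, target_feature, grid_value):
    """Hypothetical sketch: partial dependence of one tree at one grid
    value, weighting child branches by weighted_n_node_samples."""
    t = est.tree_

    def recurse(node):
        left, right = t.children_left[node], t.children_right[node]
        if left == -1:  # leaf node
            return t.value[node][0, 0]
        if t.feature[node] == target_feature:
            # Split on the target feature: follow the branch that the
            # grid value dictates.
            child = left if grid_value <= t.threshold[node] else right
            return recurse(child)
        # Split on another feature: average the children, weighted by
        # weighted_n_node_samples (the fix), not by raw sample counts.
        wl = t.weighted_n_node_samples[left]
        wr = t.weighted_n_node_samples[right]
        return (wl * recurse(left) + wr * recurse(right)) / (wl + wr)

    return recurse(0)

rng = np.random.RandomState(0)
X = rng.rand(200, 2)
y = 3 * X[:, 0] + rng.rand(200)
w = rng.rand(200) + 0.5  # non-trivial sample weights
est = DecisionTreeRegressor(max_depth=3).fit(X, y, sample_weight=w)
pd0 = partial_dependence_tree(est, target_feature=0, grid_value=0.5)
```

With uniform weights, `weighted_n_node_samples` equals `n_node_samples` and both variants agree; the difference only shows once weights vary.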

@jnothman
Member

Can a test be added?

@samronsin
Contributor Author

I just added a few! Should I add some more?

@samronsin samronsin changed the title Take sample weights into account in partial dependence computation [MRG] Take sample weights into account in partial dependence computation Feb 28, 2019
@samronsin
Contributor Author

Ping @NicolasHug as discussed during the sprint!

Member

@NicolasHug NicolasHug left a comment


The fix looks correct to me.

Please add a whatsnew entry as bugfix.

Also, it'd be nice to have some kind of functional test. Something along the lines of your example in the original issue would be good: fitting a linear regression on the PDPs should give an r-squared close to 1.

This is fine to merge before or after #12599 BTW; one of us will have to update their PR regarding the tests ;)
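The functional test suggested above could be sketched roughly like this (a hypothetical version, not the PR's actual test, and computing the PDP by brute force to stay API-agnostic): with a target linear in one feature and strongly varying sample weights, the PDP of that feature should itself be close to linear.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression

rng = np.random.RandomState(0)
X = rng.rand(500, 2)
y = 5 * X[:, 0] + rng.rand(500)
# Upweight half the samples so that weighting actually matters.
w = np.where(X[:, 1] > 0.5, 10.0, 1.0)

est = GradientBoostingRegressor(random_state=0).fit(X, y, sample_weight=w)

# Brute-force PDP for feature 0: average the model's predictions over
# the data with feature 0 clamped to each grid point.
grid = np.linspace(0.05, 0.95, 20)
pdp = []
for v in grid:
    Xg = X.copy()
    Xg[:, 0] = v
    pdp.append(est.predict(Xg).mean())
pdp = np.asarray(pdp)

# A linear regression on the PDP should explain almost all variance.
reg = LinearRegression().fit(grid.reshape(-1, 1), pdp)
r2 = reg.score(grid.reshape(-1, 1), pdp)
```

An assertion such as `r2 > 0.9` would then catch regressions where sample weights are silently ignored in the tree recursion.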

```diff
-raise ValueError("left_sample_frac:%d, "
+raise ValueError("left_sample_frac:%f, "
                  "n_samples current: %d, "
                  "n_samples left: %d"
```
Member


IMO this whole `if` block should be removed entirely. It's not tested anywhere as far as I can see, and it's not PDP-related (it's just tree-related).

@jnothman
Member

Tests are still failing. Please add an entry to doc/whats_new/v0.21.rst.

@samronsin
Contributor Author

Thanks for the review @NicolasHug -- I added a functional test based on the example built for the original issue.
@jnothman tests should be fixed now!

Member

@jnothman jnothman left a comment


Yes, much cleaner. LGTM if tests pass.

Member

@NicolasHug NicolasHug left a comment


Last comments but LGTM!

@NicolasHug
Member

Anyone for a quick review? @thomasjpfan maybe?

@NicolasHug
Member

@samronsin Can you please merge master (or trigger the CI with an empty commit) so that the checks go green and we can merge? :)

…into add-sample-weights-gbt-partial-dependency
@NicolasHug NicolasHug merged commit d0747ea into scikit-learn:master Apr 5, 2019
@NicolasHug
Member

Merging, since the failed test is completely unrelated (mlp.test_gradient, likely due to the random state not being set).

Thanks @samronsin!

@samronsin
Contributor Author

Thanks @NicolasHug -- and also @jnothman and @thomasjpfan -- for your help on this PR!

jeremiedbb pushed a commit to jeremiedbb/scikit-learn that referenced this pull request Apr 25, 2019
…n for gradient boosting (scikit-learn#13193)

* Replace n_node_samples by weighted_n_node_samples in partial dependence computation

* Add tests for both no-op and real sample weights

* Improve naming and remove useless comment

* Fix small test issues

* Fix test for binary classification

* Add test for regressions based on example from initial issue

* Edit whats_new

* 79

* Simplify test code for regression partial dependence

* PEP8

* Facepalm

* Refer to the public function in whats_new

* Make the sample weight test standalone for further reuse

* Fix PR number

* Testing with L1 relative distance computed as averages

* Testing element-wise

* Fix and simplify unit test for binary classification

* Clarify functional test
xhluca pushed a commit to xhluca/scikit-learn that referenced this pull request Apr 28, 2019
xhluca pushed a commit to xhluca/scikit-learn that referenced this pull request Apr 28, 2019
xhluca pushed a commit to xhluca/scikit-learn that referenced this pull request Apr 28, 2019
koenvandevelde pushed a commit to koenvandevelde/scikit-learn that referenced this pull request Jul 12, 2019
Successfully merging this pull request may close these issues.

partial_dependence ignores sample weights
4 participants