-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
DOC Fix tree explanation of tree_.value
in example
#29331
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Adam Li <adam2392@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @adam2392, thanks for addressing this issue :) Here's a couple of comments:
-
On top of the API changelog entry, we might want to also include a
versionchanged
directive for each of the attributes on those estimators. -
For the FIX changelog entry, I think you are linking to an unrelated PR, but in any case, the
proportion
parameter ofplot_tree
function has not changed in version 1.6 as far as I know, or am I misunderstanding something?
The
The PR that I linked is when the behavior changed such that |
Signed-off-by: Adam Li <adam2392@gmail.com>
The issues raised are fixed now:
|
Signed-off-by: Adam Li <adam2392@gmail.com>
I would rather not change the behavior of |
Kay that makes sense! Done |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple of final comments but otherwise LGTM :)
doc/whats_new/v1.4.rst
Outdated
:pr:`13649` by :user:`Samuel Ronsin <samronsin>`, initiated by | ||
:user:`Patrick O'Reilly <pat-oreilly>`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pinging @glemaitre for his opinion to rather linking #27639 here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#27639 indeed seems more appropriate, since this is where the choice of switching to a weighted proportion was made IIUC.
For better visibility I would also put it in a changed models section i.e. a lot more towards the top. For example there is such a section in 1.5 but a new one needs to be added for 1.4.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Edited and moved to the top.
tree_.value
in exampletree_.value
in example
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot, a few comments from me
doc/whats_new/v1.4.rst
Outdated
:pr:`13649` by :user:`Samuel Ronsin <samronsin>`, initiated by | ||
:user:`Patrick O'Reilly <pat-oreilly>`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#27639 indeed seems more appropriate, since this is where the choice of switching to a weighted proportion was made IIUC.
For better visibility I would also put it in a changed models section i.e. a lot more towards the top. For example there is such a section in 1.5 but a new one needs to be added for 1.4.
Signed-off-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
# | ||
# One could convert this to the absolute weighted number of samples reaching a node, | ||
# by multiplying this number by `tree_.weighted_n_node_samples[node_idx]` for the | ||
# given node, or by `tree_.n_node_samples[node_idx]` for the unweighted number of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am a bit unsure about multiplying tree_.value
(proportion of samples taking into account sample weights) by tree_.n_node_samples[node_idx]
which does not take sample weights into account. It seems to me that the output of this multiplication has not a clear meaning, but maybe I am missing something ...
Maybe only mention tree_.weighted_n_nodes_sample
and mention that sample weights are not used in this example, which is the same as being one for all the samples?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done: 3c1f24a
|
||
tree.plot_tree(clf) | ||
tree.plot_tree(clf, proportion=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess that works for this example, but in an ideal world (not this PR) it would be nice if by default plot_tree
would show something that matches tree_.value
. Is it worth doing a 2 release deprecation period where you say that proportion
default value is going to switch from False
to True
. Could we bend our rules and switch to proportion=True
directly?
I'd slightly lean towards bending our rules and switch to proportion=True
by default but this is a bit controversial.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I originally did deprecation, but @ArturoAmorQ (#29331 (comment)) suggested I keep the default behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I honestly think either way is okay as this is like power-user stuff
Signed-off-by: Adam Li <adam2392@gmail.com>
OK thanks a lot for your patience and continued work on this @adam2392, let's merge this one! |
@adam2392 if you feel adventurous you may want to open an issue about aggressively switching the default to |
Co-authored-by: Arturo Amor <86408019+ArturoAmorQ@users.noreply.github.com> Co-authored-by: Loïc Estève <loic.esteve@ymail.com>
Reference Issues/PRs
Fixes: #28921
What does this implement/fix? Explain your changes.
Fixes the explanation of
tree_.value
as suggested in #28921 (comment).A follow-on PR that changes the name of
tree_.value
with a deprecation totree_.weighted_proportion
seems better and more explicit.Any other comments?
cc: @lesteve