-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
FIX solve wrong indexing in plots with ICE when subsampling #18359
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
😞 Yes, the indexing seems to be quite off. I am relieved this was found out before the feature was released. Thanks, @glemaitre for the quick fix. |
Don't worry we already released bugs in the past :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find the code quite complex to review. I am not even sure what was the actual indexing bug to look for. Could you please describe it in two words? Would it happen in the "both"
case? I am not sure I can see the difference visually on the rendered HTML example.
Anyways, here are some comment to try to refactor this code to make it easier to follow by introducing sub-methods:
Our plotting was bug resilient :) from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.inspection import plot_partial_dependence
X, y = make_classification()
model = LogisticRegression().fit(X, y)
disp = plot_partial_dependence(
model, X, features=[0, 1, 2, 3], kind="both",
subsample=3
) The axis will be organized on a grid of (2, 3) disp.axes_.shape
(2, 3) You can access the line of a plot with disp.lines_[0, 1].shape
(4,) The array disp.lines_[0, 1]
[<matplotlib.lines.Line2D object at 0x7fc12c0143a0> None
<matplotlib.lines.Line2D object at 0x7fc12bff69a0> None] So we have 2 values at None because we did not place the line at the right place. My non-regression test check that there are no |
I think that we should use |
I refactor the code with a 1-way private function calling the ICE and PDP plotting function and a 2-way plotting function. I still have to document and maybe add a couple of comments to clarify the indexing or some parameters. |
@ogrisel The code is ready for another review. A bit less easy to review but I think the refactoring makes it easier to go through the code, hopefully :). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some more nitpicks but otherwise this looks good enough :)
ping @thomasjpfan I think that it should be ready now. |
pinging @NicolasHug and @adrinjalali |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The black formatting + refactoring did make this a little harder to review. But having looked through it all, this LGTM!
This still needs a whats new for the random_state
parameter and the bug fix.
@thomasjpfan no need for the what's new because we did not release it yet. |
This PR proposes the following changes:
random_state
parameter