ENH Reduce copying when centering PDPs #23076
Conversation
cc @glemaitre
Thanks for taking care of it. I have only one nitpick, which would avoid some code redundancy.
Actually, we did that in the former PR in one of the last commits :)
Thanks for the follow-up. LGTM once @glemaitre's comment is resolved
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Thanks @thomasjpfan
Reference Issues/PRs
Follow-up to #18310
What does this implement/fix? Explain your changes.
Since `pdp_lim` already does the subtraction to compute the limits, I do not think we need to do the computation again in the private `_plot` methods. Also, I think `kind="average"` should center when `centered=True`; otherwise parts of the plot get cut off. For example, when running this:

Code snippet
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import QuantileTransformer
from sklearn.neural_network import MLPRegressor
from sklearn.inspection import PartialDependenceDisplay
# Load data and center the target so predictions hover around zero
X, y = fetch_california_housing(as_frame=True, return_X_y=True)
y -= y.mean()
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1, random_state=0)
est = make_pipeline(
QuantileTransformer(),
MLPRegressor(
hidden_layer_sizes=(30, 15),
learning_rate_init=0.01,
early_stopping=True,
random_state=0,
),
)
est.fit(X_train, y_train)
common_params = {
"n_jobs": 2,
"grid_resolution": 10,
"centered": True,
"random_state": 0,
}
# Plot centered average partial dependence for three features
display = PartialDependenceDisplay.from_estimator(
est,
X_train,
features=["MedInc", "AveOccup", "HouseAge"],
kind="average",
**common_params,
)
display.figure_.suptitle("centered=True")
# Re-plot the same curves without centering for comparison
display.plot(centered=False)
_ = display.figure_.suptitle("centered=False")
Notice how on `main` the `centered=True` version has part of the plot cut off.

main: (screenshot)

This PR: (screenshot)
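For reference, centering amounts to subtracting each curve's value at the first grid point so that every curve starts at 0, and the y-limits can then be taken from the already-centered values. Below is a minimal NumPy sketch of that arithmetic (the array is made up for illustration and is not scikit-learn's internal representation):

import numpy as np

# Hypothetical averaged predictions over a 1D grid, shape (n_outputs, n_grid_points)
avg_preds = np.array([[2.1, 2.4, 2.9, 3.5]])

# Centering: subtract the value at the first grid point so each curve starts at 0
centered = avg_preds - avg_preds[:, [0]]
print(centered)  # [[0.  0.3 0.8 1.4]]

# The limits can be read off the centered values directly (this is the
# subtraction pdp_lim already performs), so the private plot methods do not
# need to repeat it
y_min, y_max = centered.min(), centered.max()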