ENH Reduce copying when centering PDPs #23076
Merged
Reference Issues/PRs
Follow up to #18310
What does this implement/fix? Explain your changes.
Since `pdp_lim` already does the subtraction to compute the limits, I do not think we need to do the computation again in the private `_plot` methods.
Also, I think `kind="average"` should center when `centered=True`; otherwise parts of the plot get cut off. For example, when running this:
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import QuantileTransformer
from sklearn.neural_network import MLPRegressor
from sklearn.inspection import PartialDependenceDisplay

X, y = fetch_california_housing(as_frame=True, return_X_y=True)
y -= y.mean()
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1, random_state=0)

est = make_pipeline(
    QuantileTransformer(),
    MLPRegressor(
        hidden_layer_sizes=(30, 15),
        learning_rate_init=0.01,
        early_stopping=True,
        random_state=0,
    ),
)
est.fit(X_train, y_train)

common_params = {
    "n_jobs": 2,
    "grid_resolution": 10,
    "centered": True,
    "random_state": 0,
}
display = PartialDependenceDisplay.from_estimator(
    est,
    X_train,
    features=["MedInc", "AveOccup", "HouseAge"],
    kind="average",
    **common_params,
)
display.figure_.suptitle("centered=True")

display.plot(centered=False)
_ = display.figure_.suptitle("centered=False")
Notice how on `main` the `centered=True` version has part of the plot cut off.
[Screenshots: main | This PR]
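To illustrate why the cut-off happens, here is a minimal NumPy sketch, not the scikit-learn implementation. It assumes centering means subtracting each curve's first grid value so that every line starts at zero; the helpers `center_curves` and `axis_limits` and the toy data are hypothetical and exist only for this example.

```python
import numpy as np


def center_curves(values):
    """Shift each curve so it starts at zero by subtracting its first grid value."""
    values = np.asarray(values, dtype=float)
    return values - values[..., :1]


def axis_limits(values):
    """Return the (min, max) range that a y-axis would need to show `values`."""
    values = np.asarray(values, dtype=float)
    return float(values.min()), float(values.max())


# Hypothetical ICE curves: 3 samples evaluated on 4 grid points.
ice = np.array(
    [
        [2.0, 2.5, 3.0, 4.0],
        [1.0, 0.5, 0.0, -1.0],
        [3.0, 3.0, 3.5, 5.0],
    ]
)
average = ice.mean(axis=0)

centered_ice = center_curves(ice)
centered_average = center_curves(average)

# Limits computed from uncentered values do not cover the centered curve,
# which is the kind of mismatch that cuts off part of a plot.
raw_limits = axis_limits(average)
centered_limits = axis_limits(centered_average)
print("limits from raw average:     ", raw_limits)
print("limits from centered average:", centered_limits)
```

In this toy data the centered average lies below the lower limit derived from the raw average, so drawing the centered curve against the uncentered limits would hide most of it. Centering is also linear, so the mean of the centered ICE curves equals the centered average. That is why doing the subtraction once and reusing the result for both the limits and the lines is sufficient.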