With #29705, we have a simple way to freeze estimators, which means there is no need for cv="prefit". This also opens the door for #8350 to make Pipeline and FeatureUnion follow our conventions. This issue is to discuss the API implications of introducing FrozenEstimator. Here are the two I had in mind:
cv="prefit"
For the cv="prefit" case, users wrap the fitted estimator in FrozenEstimator and pass it directly to the meta-estimator, instead of setting cv="prefit":
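from sklearn.calibration import CalibratedClassifierCV
from sklearn.ensemble import RandomForestClassifier
from sklearn.frozen import FrozenEstimator

rf = RandomForestClassifier()
rf.fit(X_train, y_train)
frozen_rf = FrozenEstimator(rf)  # later calls to fit leave rf untouched
calibration = CalibratedClassifierCV(frozen_rf)
calibration.fit(X_calib, y_calib)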
Making this change will simplify our codebase by removing the special handling for cv="prefit".
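To illustrate why freezing can replace a dedicated "prefit" branch, here is a minimal pure-Python sketch. It does not use scikit-learn and is not its implementation: ToyMean, ToyFrozen, ToyMeta, toy_clone and the _toy_clone_hook method are all hypothetical names invented for this example. The assumption being illustrated is that a frozen wrapper ignores fit and survives cloning, so a meta-estimator can always clone and fit without special-casing.

```python
class ToyMean:
    """Hypothetical stand-in for an estimator: predicts the training mean."""

    def fit(self, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self):
        return self.mean_


class ToyFrozen:
    """Hypothetical wrapper mimicking the freezing idea: fit does nothing."""

    def __init__(self, estimator):
        self.estimator = estimator

    def fit(self, y):
        return self  # no-op: the wrapped estimator keeps its fitted state

    def predict(self):
        return self.estimator.predict()

    def _toy_clone_hook(self):
        return self  # cloning a frozen estimator hands back the same object


def toy_clone(est):
    # Use the estimator's own hook if it has one, else build an unfitted copy.
    hook = getattr(est, "_toy_clone_hook", None)
    return hook() if hook is not None else type(est)()


class ToyMeta:
    """Meta-estimator with no 'prefit' branch: it always clones, then fits."""

    def __init__(self, estimator):
        self.estimator = estimator

    def fit(self, y):
        self.estimator_ = toy_clone(self.estimator).fit(y)
        return self


base = ToyMean().fit([1.0, 2.0, 3.0])               # trained on "train" data
refit = ToyMeta(base).fit([10.0, 20.0])             # plain estimator: refit on new data
kept = ToyMeta(ToyFrozen(base)).fit([10.0, 20.0])   # frozen: earlier training is kept

print(refit.estimator_.predict())  # 15.0
print(kept.estimator_.predict())   # 2.0
```

The meta-estimator's fit has a single code path in both cases; the choice between refitting and reusing is expressed by the caller through the wrapper.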
compose.Pipeline
We introduce a new compose.Pipeline that follows our convention of cloning its sub-estimators before fitting them. (The current pipeline.Pipeline does not clone its steps.)
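from sklearn.compose import ColumnTransformer
from sklearn.compose import Pipeline  # proposed here; distinct from sklearn.pipeline.Pipeline
from sklearn.frozen import FrozenEstimator
from sklearn.linear_model import LogisticRegression

prep = ColumnTransformer(...)
prep.fit(X_train, y_train)
frozen_prep = FrozenEstimator(prep)
pipe = Pipeline([frozen_prep, LogisticRegression()])
pipe.fit(X_another, y_another)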
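The interaction between a cloning pipeline and a frozen step can be sketched in plain Python. This is only a toy model under stated assumptions, not the proposed compose.Pipeline: ToyScaler, ToyFrozen, ToyPipeline, toy_clone and _toy_clone_hook are hypothetical names, and the sketch assumes that cloning a frozen step returns the same object while cloning any other step yields an unfitted copy.

```python
class ToyScaler:
    """Hypothetical transformer: divides by the largest value seen in fit."""

    def fit(self, X):
        self.scale_ = max(X)
        return self

    def transform(self, X):
        return [x / self.scale_ for x in X]


class ToyFrozen:
    """Hypothetical freezing wrapper: fit is a no-op, transform is delegated."""

    def __init__(self, estimator):
        self.estimator = estimator

    def fit(self, X):
        return self

    def transform(self, X):
        return self.estimator.transform(X)

    def _toy_clone_hook(self):
        return self  # a frozen step is never replaced by an unfitted copy


def toy_clone(est):
    hook = getattr(est, "_toy_clone_hook", None)
    return hook() if hook is not None else type(est)()


class ToyPipeline:
    """Clones every step in fit, so the steps passed by the user stay unfitted."""

    def __init__(self, steps):
        self.steps = steps

    def fit(self, X):
        self.steps_ = []
        for step in self.steps:
            fitted = toy_clone(step).fit(X)
            self.steps_.append(fitted)
            X = fitted.transform(X)
        return self

    def transform(self, X):
        for step in self.steps_:
            X = step.transform(X)
        return X


prep = ToyScaler().fit([1.0, 2.0, 4.0])         # fitted earlier, scale_ == 4.0
pipe = ToyPipeline([ToyFrozen(prep), ToyScaler()])
pipe.fit([2.0, 8.0])                            # only the second step is fitted here

print(prep.scale_)                      # 4.0, unchanged by pipe.fit
print(hasattr(pipe.steps[1], "scale_")) # False, the user's step was cloned
print(pipe.transform([4.0]))            # [0.5]
```

The frozen first step keeps the scale learned on the earlier data, while the second step is fitted on a clone, so the objects the user passed in are left as they were.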
In both cases, I prefer the semantics of FrozenEstimator.