
Implications of FrozenEstimator on our API #29893

@thomasjpfan

With #29705, we have a simple way to freeze estimators, which means there is no need for cv="prefit". This also opens the door for #8350 to make Pipeline and FeatureUnion follow our conventions. This issue is to discuss the API implications of introducing FrozenEstimator. Here are the two I had in mind:

cv="prefit"

For the cv case, users wrap the fitted estimator in FrozenEstimator and pass it directly to the meta-estimator, instead of setting cv="prefit":

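from sklearn.calibration import CalibratedClassifierCV
from sklearn.ensemble import RandomForestClassifier
from sklearn.frozen import FrozenEstimator

rf = RandomForestClassifier()
rf.fit(X_train, y_train)
frozen_rf = FrozenEstimator(rf)  # fit() on the wrapper will not refit rf

calibration = CalibratedClassifierCV(frozen_rf)
calibration.fit(X_calib, y_calib)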

Making this change will simplify our codebase, because the special-case handling for cv="prefit" can be removed.

compose.Pipeline

We introduce a new compose.Pipeline which follows our conventions with clone. (The current pipeline.Pipeline does not clone.)

from sklearn.compose import Pipeline

prep = ColumnTransformer(...)
prep.fit(X_train, y_train)
frozen_prep = FrozenEstimator(prep)

pipe = Pipeline([frozen_prep, LogisticRegression()])

pipe.fit(X_another, y_another)
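Since compose.Pipeline is only a proposal here, the following is a pure-Python toy of the intended semantics rather than scikit-learn code. Every name in it (ToyScaler, ToyFrozen, toy_clone, ToyPipeline) is hypothetical: the pipeline clones its steps before fitting, and a frozen step opts out of both cloning and refitting.

```python
class ToyScaler:
    """Hypothetical transformer: learns the mean and subtracts it."""

    def fit(self, X, y=None):
        self.mean_ = sum(X) / len(X)
        return self

    def transform(self, X):
        return [x - self.mean_ for x in X]


class ToyFrozen:
    """Hypothetical stand-in for a frozen wrapper around a fitted step."""

    def __init__(self, estimator):
        self.estimator = estimator

    def fit(self, X, y=None):
        return self  # no-op: the wrapped estimator keeps its fitted state

    def transform(self, X):
        return self.estimator.transform(X)

    def clone(self):
        return self  # opt out of cloning so the fitted state survives


def toy_clone(est):
    # Frozen wrappers are returned as-is; anything else gets a fresh,
    # unfitted instance (ToyScaler takes no constructor arguments).
    return est.clone() if hasattr(est, "clone") else type(est)()


class ToyPipeline:
    """Hypothetical pipeline that clones its steps in fit()."""

    def __init__(self, steps):
        self.steps = steps

    def fit(self, X, y=None):
        self.steps_ = [toy_clone(step) for step in self.steps]
        for step in self.steps_:
            X = step.fit(X, y).transform(X)
        return self

    def transform(self, X):
        for step in self.steps_:
            X = step.transform(X)
        return X


prep = ToyScaler().fit([0.0, 2.0, 4.0])            # learns mean_ = 2.0
pipe = ToyPipeline([ToyFrozen(prep), ToyScaler()])
pipe.fit([10.0, 20.0])                             # prep is not refit
result = pipe.transform([10.0, 20.0])
```

In this toy, the instances passed to the constructor are never mutated by fit; the fitted copies live in steps_, while the frozen step carries its earlier fit through unchanged.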

In both cases, I prefer the semantics of FrozenEstimator.
