GridSearchCV saves all fitted estimator in cv_results['params'] when params are estimators

#### Description
I use GridSearchCV to optimize the hyperparameters of a pipeline. I set the param grid by inputing transformers or estimators at different steps of the pipeline, following the Pipeline documentation:

> A step’s estimator may be replaced entirely by setting the parameter with its name to another estimator, or a transformer removed by setting to None.

I couldn't figure why dumping cv_results_ would take so much memory on disk. It happens that cv_results_['params'] and all cv_results_['param_*'] objects contains fitted estimators, as much as there are points on my grid.

This bug should happen only when n_jobs = 1 (which is my usecase).

I don't think this is intended (else the arguments and attributes _refit_ and _best_\__estimator_ wouldn't be used).

My guess is that during the grid search, those estimator's aren't cloned before use (which could be a problem if using the same grid search several times, because estimators passed in the next grid would be fitted...).

#### Version: 0.19.0



Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

GridSearchCV saves all fitted estimator in cv_results['params'] when params are estimators #10063

Description

Version: 0.19.0

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Uh oh!

GridSearchCV saves all fitted estimator in cv_results['params'] when params are estimators #10063

Description

Description

Version: 0.19.0

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions