Closed
Description
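This concerns feature_selection.SelectFromModel used within a Pipeline.
When the hyperparameters of the estimator wrapped by SelectFromModel are changed through Pipeline.set_params, the new values are not used when the pipeline is fitted again.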
Steps/Code to Reproduce
import numpy as np
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.model_selection import ParameterGrid
N = 100
X = np.random.randn(N, 50)
y = np.random.randint(0, 2, N)
print(X.shape, y.shape)
pipe = Pipeline([('fs', SelectFromModel(estimator=LogisticRegression(penalty='l1'))),
                 ('lr', LogisticRegression(penalty='l1'))])
parameters = {'fs__estimator__C': [1, 10], 'lr__C': [100]}
param_grid = ParameterGrid(parameters)
for param in param_grid:
    # Set the new hyperparameters on the same (already fitted) pipeline object
    pipe.set_params(**param)
    d = pipe.get_params()
    print('SELECTED:')
    print('SelectFromModel - LogisticRegression: C', d['fs__estimator__C'])
    pipe.fit(X, y)
    # The fitted estimator_ should report the C value selected above
    print('ACTUAL:')
    print(pipe.named_steps['fs'].estimator_)
Expected Results
After set_params is called and the pipeline is fitted again, the hyperparameters of SelectFromModel.estimator_ should match the newly set values.
Actual Results
set_params does modify SelectFromModel.estimator, but fit uses SelectFromModel.estimator_, which still carries the parameters from the first fit.
Fitting the same SelectFromModel instance repeatedly never updates SelectFromModel.estimator_.
cf. fit method:
if not hasattr(self, "estimator_"):
    self.estimator_ = clone(self.estimator)
self.estimator_.fit(X, y, **fit_params)
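To isolate the behaviour from scikit-learn itself, here is a minimal toy sketch of the clone-once pattern quoted above. ToyEstimator, CloneOnceSelector, CloneAlwaysSelector and fitted_C_values are hypothetical names made up for illustration, and copy.deepcopy stands in for sklearn.base.clone. The "clone on every fit" variant is only one possible fix, not necessarily what the library does or should do.

```python
import copy


class ToyEstimator:
    """Hypothetical stand-in for an estimator with a single hyperparameter C."""

    def __init__(self, C=1.0):
        self.C = C

    def fit(self, X, y):
        return self


class CloneOnceSelector:
    """Mimics the quoted fit logic: clone only when estimator_ is missing."""

    def __init__(self, estimator):
        self.estimator = estimator

    def fit(self, X, y):
        if not hasattr(self, "estimator_"):
            # deepcopy is used here in place of sklearn.base.clone (assumption)
            self.estimator_ = copy.deepcopy(self.estimator)
        self.estimator_.fit(X, y)
        return self


class CloneAlwaysSelector(CloneOnceSelector):
    """One possible fix (an assumption): take a fresh copy on every fit."""

    def fit(self, X, y):
        self.estimator_ = copy.deepcopy(self.estimator)
        self.estimator_.fit(X, y)
        return self


def fitted_C_values(selector_cls, C_values=(1, 10)):
    """Refit one selector instance per C value and record estimator_.C."""
    selector = selector_cls(ToyEstimator())
    seen = []
    for C in C_values:
        # Roughly what set_params(estimator__C=C) does to the unfitted template
        selector.estimator.C = C
        selector.fit(X=None, y=None)
        seen.append(selector.estimator_.C)
    return seen


print(fitted_C_values(CloneOnceSelector))    # [1, 1]  -> second C is ignored
print(fitted_C_values(CloneAlwaysSelector))  # [1, 10] -> C follows the template
```

With the clone-once logic, the second fit silently reuses the copy made during the first fit, which matches the ACTUAL output reported above.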
Versions
Linux-4.4.0-45-generic-x86_64-with-Ubuntu-16.04-xenial
('Python', '2.7.12 (default, Jul 1 2016, 15:12:24) \n[GCC 5.4.0 20160609]')
('NumPy', '1.11.2')
('SciPy', '0.18.1')
('Scikit-Learn', '0.19.dev0')