GridSearchCV with Pipeline without Predictor #14693
Comments
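The issue is not the transformer, but the scoring. It seems strange to have the output of a transformer compared to the target, but that can be done with a custom scorer:

```python
from sklearn.metrics import r2_score
from sklearn.model_selection import GridSearchCV

def trans_r2(est, X, y):
    # Score the transformed output (not a prediction) against the target.
    return r2_score(y, est.transform(X))

GridSearchCV(..., scoring=trans_r2)
```

PS: posting the full traceback would have revealed that ;)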
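To make the custom-scorer suggestion above concrete, here is a minimal runnable sketch. The `ScaleBy` transformer, its `factor` parameter, and the toy data are hypothetical and exist only for illustration; they are not taken from the original report.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.metrics import r2_score
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline


class ScaleBy(TransformerMixin, BaseEstimator):
    """Hypothetical transformer: multiplies X by a tunable factor."""

    def __init__(self, factor=1.0):
        self.factor = factor

    def fit(self, X, y=None):
        # Nothing to learn; set a trailing-underscore attribute so that
        # scikit-learn's fitted-state checks treat the step as fitted.
        self.is_fitted_ = True
        return self

    def transform(self, X):
        return np.asarray(X) * self.factor


def trans_r2(est, X, y):
    # Compare the transformer's output directly with the target.
    return r2_score(y, est.transform(X).ravel())


# Toy data (assumption): the target is exactly twice the single feature.
rng = np.random.RandomState(0)
X = rng.rand(60, 1)
y = 2.0 * X.ravel()

# A pipeline that contains only a transformer, searched with the custom scorer.
pipe = Pipeline([("scale", ScaleBy())])
search = GridSearchCV(
    pipe, {"scale__factor": [0.5, 1.0, 2.0]}, scoring=trans_r2, cv=3
)
search.fit(X, y)
best_factor = search.best_params_["scale__factor"]
```

Because the scorer calls `transform` rather than `predict`, the pipeline does not need a predictor as its last step.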
(renamed issue as "estimator" includes transformers)
Oh, I see. Thanks @amueller.
Well, I guess this would make sense if the transformer is a prediction post-processing step (e.g. smoothing for time-series regression as per my example above). But I understand this cannot be supported (or there are no plans of doing so) given that you can't have a predictor step, unless it is the last one, right?
Perhaps you also want `TransformedTargetRegressor`.
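As a rough illustration of the `TransformedTargetRegressor` suggestion, the sketch below scales the target before fitting and inverts the scaling at prediction time, all inside a grid search. The choice of `Ridge`, `StandardScaler`, the `alpha` grid, and the toy data are assumptions made for the example.

```python
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV
from sklearn.preprocessing import StandardScaler

# Toy data (assumption): a target on a much larger scale than the features.
rng = np.random.RandomState(0)
X = rng.rand(50, 2)
y = 1000.0 * X[:, 0] + 500.0

# The target is standardized for fitting; predictions are mapped back
# to the original scale automatically via the scaler's inverse_transform.
model = TransformedTargetRegressor(
    regressor=Ridge(), transformer=StandardScaler()
)

# Hyper-parameters of the wrapped regressor use the "regressor__" prefix.
search = GridSearchCV(model, {"regressor__alpha": [0.01, 1.0]}, cv=3)
search.fit(X, y)
pred = search.predict(X)
```

This covers the case where the post-processing step is the inverse of a target pre-processing step; it does not cover arbitrary post-processing such as smoothing.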
Target post-processing is a bit tricky unfortunately. I think you can use
Description
The Pipeline documentation states that:
However, trying to use `GridSearchCV` with a `Pipeline` that only includes transformers will fail. (This is the hypothetical scenario in which you are trying to select hyper-parameters for a transformer only; it can be useful if the transformer in question is a post-processing step that takes some hyper-parameters.)
Steps/Code to Reproduce
For instance, the following (dumb example) will raise an error:
Output:
However, if you include a dumb estimator that simply passes its input to the output at the end of the pipeline, it will run with no issues:
Output:
I realise that this is not a common case (needing to only fit a transformer with hyper-parameters), but is this intended behaviour?
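The pass-through workaround described above might look roughly like the sketch below. The `ScaleBy` and `Identity` classes, the `scoring="r2"` setting, and the toy data are all assumptions for illustration and are not the snippets from the original report.

```python
import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin, TransformerMixin
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline


class ScaleBy(TransformerMixin, BaseEstimator):
    """Hypothetical transformer with one hyper-parameter."""

    def __init__(self, factor=1.0):
        self.factor = factor

    def fit(self, X, y=None):
        self.is_fitted_ = True  # marks the step as fitted
        return self

    def transform(self, X):
        return np.asarray(X) * self.factor


class Identity(RegressorMixin, BaseEstimator):
    """Hypothetical final step: returns its (single-column) input as the prediction."""

    def fit(self, X, y=None):
        self.is_fitted_ = True  # marks the step as fitted
        return self

    def predict(self, X):
        return np.asarray(X).ravel()


# Toy data (assumption): the target is exactly twice the single feature.
rng = np.random.RandomState(0)
X = rng.rand(60, 1)
y = 2.0 * X.ravel()

# With a predictor as the last step, a predict-based scorer such as "r2" works.
pipe = Pipeline([("scale", ScaleBy()), ("identity", Identity())])
search = GridSearchCV(
    pipe, {"scale__factor": [0.5, 1.0, 2.0]}, scoring="r2", cv=3
)
search.fit(X, y)
```

The final step exists only to expose `predict`, so that the scorer has something to call; the hyper-parameter being tuned still belongs to the transformer.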
On a related note, is it possible to include a post-processing transformer (e.g. smoothing for time-series regression, or even a scaler in case the target has been pre-processed) at the end of a `Pipeline` and still be able to use `GridSearchCV`? According to the documentation it shouldn't be, since in that case not all steps preceding the last one are transformers. See also #4143.
Versions
System:
    python: 3.6.8 |Anaconda, Inc.| (default, Feb 21 2019, 18:30:04) [MSC v.1916 64 bit (AMD64)]
    executable: C:\Users\nak142\Miniconda3\envs\myo\python.exe
    machine: Windows-10-10.0.18362-SP0

BLAS:
    macros:
    lib_dirs:
    cblas_libs: cblas

Python deps:
    pip: 19.1.1
    setuptools: 41.0.1
    sklearn: 0.21.2
    numpy: 1.16.4
    scipy: 1.2.1
    Cython: 0.29.12
    pandas: 0.24.2