-
-
Notifications
You must be signed in to change notification settings - Fork 26.2k
FIX Avoid fitting a pipeline without steps #31723
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@@ -1289,7 +1291,6 @@ def __sklearn_is_fitted__(self): | |||
|
|||
An empty pipeline is considered fitted. | |||
""" | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is where I wanted to to check if the pipeline had steps. But this means the display will fail as well.
And I just saw this: "An empty pipeline is considered fitted." This means that the HTML display is actually correct.
I think we need to be careful before requiring non-empty steps to a Pipeline. We never know how users use the estimators. For instance you can create an empty pipeline and set the steps afterwards using That being said, checking for non-empty steps in
Pipeline(steps=[]).fit(X,y) # fails
Pipeline(steps=[("a", "passthrough")]).fit(X,y) # works I'd like to have the opinion of @glemaitre, @adrinjalali, or @ogrisel before making a decision here. |
I think the "empty is passthrough' semantics feel quite natural. |
Since right now fitting an empty pipeline fails anyway, I'm happy with this PR since it makes the error message much more user friendly and doesn't really change any behavior. |
Right now you can't use a In a new PR we can, if we decide to do it, start allowing empty pipelines. |
Right know, this works in jupyter:
If we change the behaviour of
|
I am not sure I understand why we should raise an exception from |
I thought that it was what you meant (in the issue). 🫣 |
I don't think |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Not sure it requires a changelog entry since it's just a more informative error message. I let you decide
I don't need one. But maybe you were asking the others? |
Reference Issues/PRs
What does this implement/fix? Explain your changes.
Trying to fix #31700
Any other comments?
I wanted add the check to
__sklearn_is_fitted__
as well. That way we avoid an emptyPipeline
to be displayed as fitted (in an HTML display), and we have a clear message that it needs more info. But this may be incorrect.If the empty
Pipeline
display needs to be kept (but shown as "not fitted"), then estimator.py needs love.Friendly ping @betatim @glemaitre