-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
Pipeline pop #8448
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Pipeline pop #8448
Conversation
Conceptually Fixes scikit-learn#8414 and related issues. Alternative to scikit-learn#2568 without __getitem__ and mixed semantics. Designed to assist in model inspection and particularly to replicate the composite transformer represented by steps of the pipeline with the exception of the last. I.e. pipe.get_subsequence(0, -1) is a common idiom. I feel like this becomes more necessary when considering more API-consistent clone behaviour as per scikit-learn#8350 as Pipeline(pipe.steps[:-1]) is no longer possible.
Codecov Report
@@ Coverage Diff @@
## master #8448 +/- ##
==========================================
+ Coverage 95.47% 95.48% +<.01%
==========================================
Files 342 342
Lines 60907 60946 +39
==========================================
+ Hits 58154 58193 +39
Misses 2753 2753
Continue to review full report at Codecov.
|
@jnothman I am for checking that the I don't see the user case for popping an unfitted |
Thinking about this on my day away from computers, I've realised that
popping anything other than head or tail is strange when dealing with a
fitted pipeline. Splitting the pipeline at an arbitrary point makes some
sense... but I like the interface here which gives you back the estimator
rather than two pipelines.
…On 25 February 2017 at 04:48, Guillaume Lemaitre ***@***.***> wrote:
@jnothman <https://github.com/jnothman> I am for checking that the
Pipeline was fitted.
I don't see the user case for popping an unfitted Pipeline despite if one
wants to avoid rebuilding a Pipeline instance from scratch.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#8448 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAEz66l5NFQ9m_AeewQi7EOmemFqgA18ks5rfxf3gaJpZM4MKv-3>
.
|
I would add an optional "start" though I don't have a strong opinion about it. |
I wouldn't return the popped-off element, I think that's rarely useful. Checking whether it's fitted or not doesn't make a difference now but will in the future if we clone. That was the discussion point, right? |
Returning the popped element as well as the head *is* useful as shown in
#8431 (comment),
since the popped element stores the feature importances with respect to the
head's transformations.
|
I don't have a strong opinion on this. |
List slicing semantics continue to make a lot of sense here. Passing a
slice would return a pipeline containing the same elements, passing an
index returns just the estimator.
|
How would you implement that? Using |
Yes, slicing would be with `__getitem__` (that's the only sensible way to
work with slices, no?)
|
Conceptually Fixes #8414 and related issues. Alternative to #8431, #2568, #2561, #2562, focusing on extracting one specified estimator.
Designed to assist in model inspection and particularly to replicate the
composite transformer represented by steps of the pipeline with the
exception of the last. I.e.
transformer, predictor = pipe.pop()
is a commonidiom. I feel like this becomes more necessary when considering more
API-consistent clone behaviour as per #8350 as
Pipeline(pipe.steps[:-1]).inverse_transform(Xt)
is no longer possible.
Note that once #8350 is merged, I intend that the popped estimator reflect the instance in
steps_
. I'm not sure what behaviour should be when thePipeline
is not fitted andpop
is called.Ping @glemaitre, @GaelVaroquaux
TODO: