Pipeline pop #8448

jnothman · 2017-02-24T02:23:29Z

Conceptually Fixes #8414 and related issues. Alternative to #8431, #2568, #2561, #2562, focusing on extracting one specified estimator.

Designed to assist in model inspection, and particularly to replicate the
composite transformer represented by all steps of the pipeline except the
last; i.e. `transformer, predictor = pipe.pop()` would be a common idiom.
I feel this becomes more necessary when considering the more API-consistent
clone behaviour proposed in #8350, under which
`Pipeline(pipe.steps[:-1]).inverse_transform(Xt)` is no longer possible.
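
For readers following the thread, the idiom above can be sketched with a toy class. This is only an illustration under stated assumptions: `ToyPipeline` and its step functions are hypothetical, the steps are plain callables rather than scikit-learn estimators, and the choice to leave the original pipeline unmodified is an assumption, not something this PR confirms.

```python
class ToyPipeline:
    """Hypothetical stand-in for a pipeline: an ordered list of (name, step) pairs."""

    def __init__(self, steps):
        self.steps = list(steps)

    def transform(self, X):
        # Apply every step in order; each step here is a plain callable.
        for _, step in self.steps:
            X = step(X)
        return X

    def pop(self, index=-1):
        """Return (pipeline without the step at `index`, that step).

        Assumption: the original pipeline is left unmodified.
        """
        remaining = list(self.steps)
        _, popped = remaining.pop(index)
        return ToyPipeline(remaining), popped


def scale(xs):
    return [x / 10 for x in xs]


def shift(xs):
    return [x + 1 for x in xs]


def predict(xs):
    return [x > 1.5 for x in xs]


pipe = ToyPipeline([("scale", scale), ("shift", shift), ("clf", predict)])

# Split into the composite transformer (all but the last step) and the predictor.
transformer, predictor = pipe.pop()
Xt = transformer.transform([0, 10])
labels = predictor(Xt)
```

Returning both pieces is what lets the caller inspect the final step against the data as transformed by the preceding steps.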

Note that once #8350 is merged, I intend that the popped estimator reflect the instance in steps_. I'm not sure what behaviour should be when the Pipeline is not fitted and pop is called.

Ping @glemaitre, @GaelVaroquaux

TODO:

- wait for #8350 ([MRG] FIX Modify the API of Pipeline and FeatureUnion to match common scikit-learn estimators conventions) and decide whether `pop` is only available when fitted
- narrative docs
- use in existing example?

Conceptually Fixes scikit-learn#8414 and related issues. Alternative to scikit-learn#2568 without __getitem__ and mixed semantics. Designed to assist in model inspection and particularly to replicate the composite transformer represented by steps of the pipeline with the exception of the last. I.e. pipe.get_subsequence(0, -1) is a common idiom. I feel like this becomes more necessary when considering more API-consistent clone behaviour as per scikit-learn#8350 as Pipeline(pipe.steps[:-1]) is no longer possible.

codecov · 2017-02-24T03:08:01Z

Codecov Report

Merging #8448 into master will increase coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #8448      +/-   ##
==========================================
+ Coverage   95.47%   95.48%   +<.01%     
==========================================
  Files         342      342              
  Lines       60907    60946      +39     
==========================================
+ Hits        58154    58193      +39     
  Misses       2753     2753

| Impacted Files | Coverage Δ | |
|---|---|---|
| sklearn/tests/test_pipeline.py | `99.63% <100%> (+0.02%)` | ✅ |
| sklearn/pipeline.py | `99.28% <100%> (+0.02%)` | ✅ |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

glemaitre · 2017-02-24T17:48:37Z

@jnothman I am in favour of checking that the Pipeline was fitted.

I don't see the use case for popping an unfitted Pipeline, unless one wants to avoid rebuilding a Pipeline instance from scratch.

jnothman · 2017-02-25T22:43:08Z

Thinking about this on my day away from computers, I've realised that popping anything other than head or tail is strange when dealing with a fitted pipeline. Splitting the pipeline at an arbitrary point makes some sense... but I like the interface here which gives you back the estimator rather than two pipelines.

amueller · 2018-11-27T20:27:41Z

I would add an optional "start" though I don't have a strong opinion about it.

amueller · 2018-11-27T20:31:30Z

I wouldn't return the popped-off element; I think that's rarely useful.
Either way, it doesn't have the same interface as `dict.pop` or `DataFrame.pop`, which return only the popped element.

Checking whether it's fitted or not doesn't make a difference now but will in the future if we clone. That was the discussion point, right?

jnothman · 2018-11-28T12:38:58Z

Returning the popped element as well as the head *is* useful as shown in #8431 (comment), since the popped element stores the feature importances with respect to the head's transformations.

amueller · 2018-11-28T16:53:09Z

I don't have a strong opinion on this.
Getting the last step is also easy with `pipe.steps_[-1][1]` or `pipe.named_steps.name_of_estimator`.

jnothman · 2018-11-28T22:32:17Z

List slicing semantics continue to make a lot of sense here. Passing a slice would return a pipeline containing the same elements, passing an index returns just the estimator.

amueller · 2018-11-29T17:23:48Z

How would you implement that? Using __getitem__?
I am warming to head for a slice and I feel we already have good ways to get single elements.
Head could have a skip option so that it need not start at the first step, if you don't want to include it. I think this is a rarer use case, though.

jnothman · 2018-12-03T07:40:52Z

Yes, slicing would be with `__getitem__` (that's the only sensible way to work with slices, no?)
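
The list-like semantics discussed above (a slice gives back a pipeline, an integer index gives back just the estimator) could be sketched roughly as follows. This is a toy illustration, not scikit-learn's implementation: `SliceablePipeline` is a hypothetical class, the estimators are placeholder strings, and rejecting slice steps other than 1 is a choice made for this sketch.

```python
class SliceablePipeline:
    """Hypothetical pipeline holding an ordered list of (name, estimator) pairs."""

    def __init__(self, steps):
        self.steps = list(steps)

    def __getitem__(self, key):
        if isinstance(key, slice):
            # Sketch-level choice: only contiguous, forward sub-pipelines.
            if key.step not in (None, 1):
                raise ValueError("only a slice step of 1 is supported here")
            # A slice returns a new pipeline over the same step objects.
            return SliceablePipeline(self.steps[key])
        # An integer index returns just the estimator, without its name.
        _, estimator = self.steps[key]
        return estimator


pipe = SliceablePipeline([
    ("scale", "scaler object"),
    ("select", "selector object"),
    ("clf", "classifier object"),
])

head = pipe[:-1]   # sub-pipeline with every step except the last
last = pipe[-1]    # the final estimator itself
```

With this shape, `pipe[:-1]` and `pipe[-1]` together recover what `pop` returned as a pair.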

jnothman added 2 commits February 22, 2017 18:59

ENH define pop instead of get_subsequence

a60cc33

jnothman mentioned this pull request Mar 4, 2017

API for autocompletable attributes on pipeline #8481

Closed

amueller mentioned this pull request Nov 27, 2018

RFC Implement Pipeline get feature names #12627

Closed

3 tasks

amueller mentioned this pull request Feb 7, 2019

SLEP needed: slicling pipelines scikit-learn/enhancement_proposals#13

Closed

jnothman mentioned this pull request Feb 28, 2019

[MRG+1] Pipeline can now be sliced or indexed #2568

Merged

adrinjalali closed this in #2568 Mar 7, 2019
