[RFC] Voting classifier flatten transform (#7230) #7794
Also, I'm not sure whether a regression test is needed, because the added functionality is pretty simple.
I'd be tempted to deprecate the current behaviour.
tests are always needed!
Do you see harm in deprecating entirely? I'd rather that. The current
I'm happy with deprecating entirely. That needs two steps, though: introducing the parameter and removing it again. Or what did you want to do?
Also, it might be helpful to special-case binary classification and only retain the probabilities of one of the classes, say the positive one? That would result in smaller and more interpretable downstream models.
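A minimal sketch of the binary special case suggested above, using NumPy only. The `probas` array is a hypothetical stand-in for the stacked per-classifier `predict_proba` outputs, and the choice of column 1 as the positive class is an assumption, not something this PR implements.

```python
import numpy as np

# Hypothetical soft-voting output for a binary problem, with shape
# (n_classifiers, n_samples, n_classes) = (2, 3, 2).
probas = np.array([
    [[0.9, 0.1], [0.4, 0.6], [0.2, 0.8]],
    [[0.7, 0.3], [0.5, 0.5], [0.1, 0.9]],
])

# Keep only the positive-class column (assumed to be index 1) for each
# classifier, then put samples on the first axis.
positive_only = probas[:, :, 1].T  # shape (n_samples, n_classifiers)

print(positive_only.shape)
```

Since the two columns of a binary `predict_proba` output sum to one, dropping one of them loses no information and halves the number of downstream features.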
bonus points for implementing
def test_transform():
    """Check trqansform method of VotingClassifier on toy dataset."""
trqansform -> transform
    flatten_transform=True).fit(X, y)

    assert_array_equal(eclf1.transform(X).shape, (3, 4, 2))
    assert_array_equal(eclf2.transform(X).shape, (4, 6))
can you check more than shapes with an assert_equal on the values after a proper reshape?
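One way such a value check could look, sketched with NumPy only. The `probas` array is a hypothetical stand-in for the 3-D `transform` output, and the flattened layout (each classifier's class probabilities placed side by side via `np.hstack`) is an assumption about how the flattening is done.

```python
import numpy as np

# Hypothetical stand-in for eclf1.transform(X):
# 3 classifiers, 4 samples, 2 classes.
rng = np.random.default_rng(0)
probas = rng.random((3, 4, 2))

# Assumed flattened layout: per-classifier blocks concatenated column-wise.
flat = np.hstack(list(probas))  # shape (4, 6)

# "Proper reshape" of the 3-D output: move samples to the first axis,
# then merge the classifier and class axes.
reshaped = probas.swapaxes(0, 1).reshape(4, -1)

# Values, not just shapes, must agree.
np.testing.assert_array_equal(flat, reshaped)
```

Under this layout, column `c * n_classes + k` of the flattened matrix holds classifier `c`'s probability for class `k`.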
flatten_transform : bool, optional (default=False)
    Affects shape of transform output only when voting='soft'
    If voting='soft' and flatten_transform=True, transform method returns
    matrix with shape [n_samples, n_classifiers * n_classes] instead of
[n_samples, n_classifiers * n_classes] -> (n_samples, n_classifiers * n_classes)
shapes are tuples
    Affects shape of transform output only when voting='soft'
    If voting='soft' and flatten_transform=True, transform method returns
    matrix with shape [n_samples, n_classifiers * n_classes] instead of
    [n_classifiers, n_samples, n_classes].
same here
@@ -238,16 +247,25 @@ def transform(self, X):

Returns
-------
If `voting='soft'`:
If `voting='soft'` and `flatten_transform=False`:
array-like = [n_classifiers, n_samples, n_classes]
shape as tuple
- Added ``flatten_transform`` parameter to :class:`ensemble.VotingClassifier`
  to change output shape of `transform` method to 2 dimensional.
  (`#7794 <https://github.com/scikit-learn/scikit-learn/pull/7794>`_)
  by `Ibraim Ganiev`_.
should be written as :issue:`7794` by `Ibraim Ganiev`_.
How does this relate to our concurrent ambitions towards stacking (e.g. #8960)?
You'll also need to rebase on master, to get the tests running.
Reference Issue
Fixes #7230
What does this implement/fix? Explain your changes.
It adds a `flatten_transform` parameter to `VotingClassifier`, which changes the shape of the `transform` method's output to `[n_samples, n_classifiers * n_classes]` instead of `[n_classifiers, n_samples, n_classes]`. With this parameter turned on you can use `VotingClassifier` as a transformer, and feed its output to other estimators/transformers in a `Pipeline`.
Any other comments?
None, make suggestions. I Summon @amueller into this PR :)
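A rough usage sketch of the pipeline idea described above, assuming a scikit-learn release that includes the `flatten_transform` parameter. The dataset, the choice of base estimators, and the step names `"vote"` and `"final"` are illustrative assumptions, not part of this PR.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import Pipeline
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Soft-voting ensemble used as a transformer: with flatten_transform=True
# its transform output is 2-D, so a downstream estimator can consume it.
voter = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("nb", GaussianNB()),
    ],
    voting="soft",
    flatten_transform=True,
)

pipe = Pipeline([("vote", voter), ("final", DecisionTreeClassifier(random_state=0))])
pipe.fit(X, y)

# Features seen by the final estimator: one column per (classifier, class).
features = pipe.named_steps["vote"].transform(X)
print(features.shape)
```

With two base classifiers and three iris classes, the final estimator sees six probability columns per sample.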