-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
ENH add support for 'passthrough'
in FeatureUnion
#20860
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH add support for 'passthrough'
in FeatureUnion
#20860
Conversation
@amueller I have added these new changes and I removed the |
@amueller I am not able to understand how shall I fix this Check Changelog check. Could you please help me regarding this? |
Please add an entry to the change log at |
@glemaitre thanks. I did that. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the test, we should add a case where we passthrough features but with a weight to check that the features are still transformed by the weight.
sklearn/tests/test_pipeline.py
Outdated
assert_array_equal(X, ft.fit_transform(X)[:, -columns:]) | ||
assert not record | ||
|
||
pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't need this line
'passthrough'
in FeatureUnion
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
…eel/scikit-learn into passthroughFeatureUnion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First pass. Indeed we probably need additional tests to cover the interaction of "passthrough" with get_feature_names
.
Note that this will be impacted by #18444 once merged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shubhraneel Would you be able to finish this PR?
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
@glemaitre Yes, I will be able to finish this PR |
@glemaitre I have addressed all the review points. And, I have added the passthrough with transform_weights in test_pipeline too. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the PR @shubhraneel !
Tests look good. Some comments on the implementation.
Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Co-authored-by: shubhraneel <shubhraneel@gmail.com> Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com> Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org> Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
Reference Issues/PRs
Fixes #20699
What does this implement/fix? Explain your changes.
FeatureUnion._validate_transformers()
to allow 'passthrough'FeatureUnion._iter()
to useFunctionTransformer(none)
in place of passthrough to represent the identity transformer.FeatureUnion._passthrough_function()
which returns aFunctionTransformer(none)
withget_feature_names
set. This separate function is needed as theFunctionTransformer
class does not have aget_feature_names
function which is needed byFeatureUnion
.test_set_feature_union_passthrough()
to test the functionality.Any other comments?
It may be better to check if more than one transformers are not set to passthrough as having a repetition of the same features is redundant.