Skip to content

Conversation

it176131
Copy link
Contributor

@it176131 it176131 commented Dec 21, 2022

Reference Issues/PRs

ENHANCEMENT #24754

What does this implement/fix? Explain your changes.

The FeatureUnion class did not previously have the .feature_names_in_ attribute if fitted with a pandas.DataFrame. This allows access to the attribute.

Any other comments?

  • modified: sklearn/pipeline.py

    • added self._check_feature_names(...) to the .fit(...) method in FeatureUnion to allow access to the .feature_names_in_ attribute if X has features names, e.g. a pandas.DataFrame
    • updated FeatureUnion docstring to reflect the addition of .feature_names_in_ attribute
  • modified: sklearn/tests/test_pipeline.py

    • added test_feature_union_feature_names_in_() to test that FeatureUnion has a .feature_names_in_ attribute if fitted with a pandas.DataFrame and not if fitted with a numpy array

	- added self._check_feature_names(...) to the .fit(...) method in FeatureUnion to allow access to the `.feature_names_in_` attribute if `X` has features names, e.g. a pandas.DataFrame
	- updated FeatureUnion docstring to reflect the addition of .feature_names_in_ attribute

modified:   sklearn/tests/test_pipeline.py
	- added test_feature_union_feature_names_in_() to test that FeatureUnion has a `.feature_names_in_` attribute if fitted with a pandas.DataFrame and not if fitted with a numpy array
	- changelog updated with description of work
@it176131 it176131 changed the title FIX FeatureUnion feature_names_in_ attribute FEATURE FeatureUnion feature_names_in_ attribute Dec 22, 2022
	- made changelog description more precise
	- typo -- removed period (.) before `columns`
Copy link
Member

@thomasjpfan thomasjpfan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR!

	- removed `self._check_feature_names(...) from `.fit(...)` method in `FeatureUnion`
	- added `feature_names_in_()` property to `FeatureUnion` to use first transformer's `feature_names_in_` attribute if present

modified:   sklearn/tests/test_pipeline.py
	- updated docstring for `test_feature_union_feature_names_in_()` to be more precise
	- added additional assertions to check if the `feature_names_in_` attribute is available to `FeatureUnion` if it's instantiated with a transformer that has already been fit
	- updated changelog description to include `pandas.DataFrame`
	- corrected user signature to match github account
@glemaitre glemaitre changed the title FEATURE FeatureUnion feature_names_in_ attribute ENH add feature_names_in_ in FeatureUnion Dec 23, 2022
@it176131 it176131 requested review from thomasjpfan and Kshitij68 and removed request for thomasjpfan and Kshitij68 December 29, 2022 16:27
Copy link
Member

@thomasjpfan thomasjpfan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the update!

it176131 and others added 4 commits January 3, 2023 08:56
newline/whitespace between change log updates.

Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
added period at end of docstring

Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
	- removed train-test-split per code suggestion -- using `X` directly
@it176131
Copy link
Contributor Author

it176131 commented Jan 3, 2023

@thomasjpfan -- I've made the changes you suggested. Ready for another review! 🙂

Copy link
Member

@thomasjpfan thomasjpfan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@thomasjpfan thomasjpfan added the Waiting for Second Reviewer First reviewer is done, need a second one! label Jan 3, 2023
@it176131 it176131 changed the title ENH add feature_names_in_ in FeatureUnion [MRG] ENH add feature_names_in_ in FeatureUnion Jan 10, 2023
@betatim betatim added Quick Review For PRs that are quick to review and removed Waiting for Second Reviewer First reviewer is done, need a second one! labels Jan 13, 2023
@glemaitre glemaitre self-requested a review January 14, 2023 16:34
@glemaitre glemaitre changed the title [MRG] ENH add feature_names_in_ in FeatureUnion ENH add feature_names_in_ in FeatureUnion Jan 14, 2023
Copy link
Member

@glemaitre glemaitre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks @it176131

@glemaitre glemaitre merged commit bf03a63 into scikit-learn:main Jan 14, 2023
@it176131 it176131 deleted the feature_union_feature_names_in_ branch February 1, 2023 15:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
module:pipeline Quick Review For PRs that are quick to review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants