API Implements get_feature_names_out for transformers that support get_feature_names #18444
# Conflicts: # sklearn/base.py # sklearn/impute.py # sklearn/preprocessing/data.py
…into pipeline_get_feature_names
…it-learn into pipeline_get_feature_names
Co-authored-by: Olivier Grisel <olivier.grisel@gmail.com>
LGTM after a fast pass over the code. I'd love to have it in 1.0 if I dare take the risk of pushing the green button...
My plan is to follow up this PR with adding |
@thomasjpfan Are you fine with merging now to have this PR in v1.0 (given CI is green)?
@adrinjalali has already branched. I don't mind having this in 1.1.0; hopefully that will be a good motivation not to wait too long to release 1.1 :) I'll let @adrinjalali decide on the fate of this PR. Based on his decision, we will have to update the what's new entry before merging.
There is a CI failure anyway.
I still haven't tagged; I'm happy to include this if it gets merged by tomorrow.
I recently opened #20919 that enabled My fix is in 560c0d0. Since |
I merged then! Thanks very much @thomasjpfan! @adrinjalali this will need a backport to |
…t_feature_names (#18444) Co-authored-by: Andreas Mueller <andreas.mueller@columbia.edu> Co-authored-by: Andreas Mueller <andreasmuellerml@gmail.com> Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org> Co-authored-by: Olivier Grisel <olivier.grisel@gmail.com> Co-authored-by: Christian Lorentzen <lorentzen.ch@gmail.com>
## August 31st, 2021

### Gael
* TODO: Jeremy's renewal, Chiara's replacement, Mathis's consulting gig

### Olivier
- input feature names: main PR [#18010](scikit-learn/scikit-learn#18010) that links into sub PRs
  - remaining (need review): [#20853](scikit-learn/scikit-learn#20853) (found a bug in `OvOClassifier.n_features_in_`)
- reviewing `get_feature_names_out`: [#18444](scikit-learn/scikit-learn#18444)
- next: give feedback to Chiara on ARM wheel building [#20711](scikit-learn/scikit-learn#20711) (needed for the release)
- next: assist Adrin for the release process
- next: investigate regression in loky that blocks the cloudpickle release [#432](cloudpipe/cloudpickle#432)
- next: come back to intel to write a technical roadmap for a possible collaboration

### Julien
- Was on holidays
- Planned week @ Nexedi, Lille, from September 13th to 17th
- Reviewed PRs
  - [`#20567`](scikit-learn/scikit-learn#20567) Common Private Loss module
  - [`#18310`](scikit-learn/scikit-learn#18310) ENH Add option to centered ICE plots (cICE)
  - Other PRs prior to holidays
- [`#20254`](scikit-learn/scikit-learn#20254)
  - Adapted benchmarks on `pdist_aggregation` to test #20254 against sklearnex
  - Adapting PR for `fast_euclidean` and `fast_sqeuclidean` on user-facing APIs
  - Next: comparing against scipy's
- Next: Having feedback on [#20254](scikit-learn/scikit-learn#20254) would also help
- Next: I need to block time to study Cython code.

### Mathis
- `sklearn_benchmarks`
  - Adapting benchmark script to run on Margaret
  - Fix issue with profiling files too big to be deployed on Github Pages
  - Ensure deterministic benchmark results
  - Working on declarative pipeline specification
- Next: run long HPO benchmarks on Margaret

### Arturo
- Finished MOOC!
- Finished filling [Loïc's notes](https://notes.inria.fr/rgSzYtubR6uSOQIfY9Fpvw#) to find questions with score under 60% (Issue [#432](INRIA/scikit-learn-mooc#432))
  - started addressing easy-to-fix questions, resulting in gitlab MRs [#21](https://gitlab.inria.fr/learninglab/mooc-scikit-learn/mooc-scikit-learn-coordination/-/merge_requests/21) and [#22](https://gitlab.inria.fr/learninglab/mooc-scikit-learn/mooc-scikit-learn-coordination/-/merge_requests/22)
  - currently working on expanding the notes up to 70%
- Continued cross-linking forum posts with issues in GitHub, resulting in [#444](INRIA/scikit-learn-mooc#444), [#445](INRIA/scikit-learn-mooc#445), [#446](INRIA/scikit-learn-mooc#446), [#447](INRIA/scikit-learn-mooc#447) and [#448](INRIA/scikit-learn-mooc#448)

### Jérémie
- back from holidays, catching up
- Mathis' benchmarks
  - trying to find what's going on with ASV benchmarks (asv should display the versions of all build and runtime dependencies for each run)

### Guillaume
- back from holidays
- Next:
  - release with Adrin
  - check the PR and issue trackers

### TODO / Next
- Expand Loïc’s notes up to 70% (Arturo)
- Create presentation to discuss my experience doing the MOOC (Arturo)
- Help with the scikit-learn release (Olivier, Guillaume)
- HR: Jeremy's renewal, Chiara's replacement (Gael)
- Mathis's consulting gig (Olivier, Gael, Mathis)
Reference Issues/PRs
Fixes #6425
Closes #12627
Follows up on: #18010
What does this implement/fix? Explain your changes.
This PR adds:
- `get_feature_names_out` in transformers that have `get_feature_names`. This was motivated by "Feature names with input features" #13307 (comment).
- The signature of `get_feature_names_out` is always `get_feature_names_out(input_features=None)`, where `input_features` can be ignored.
- `get_feature_names` is deprecated.

The output of `get_feature_names_out` will be a list most of the time. The only exception is when `input_features` is not None and the transformer is one-to-one. I considered making this a list as well, but I do not think it is worth the copy.

CC @amueller
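To make the contract described above concrete, here is a minimal pure-Python sketch. It is not the code from this PR: the classes `OneToOneSketch` and `PairwiseProductSketch` are hypothetical, and the default names `x0, x1, ...` are an assumption made for illustration. It only shows the shared `get_feature_names_out(input_features=None)` signature and the one-to-one case where the caller's `input_features` object is handed back without a copy.

```python
class OneToOneSketch:
    """Hypothetical one-to-one transformer (not a scikit-learn class)."""

    def fit(self, X):
        # Remember how many columns were seen during fit.
        self.n_features_in_ = len(X[0])
        return self

    def get_feature_names_out(self, input_features=None):
        if input_features is None:
            # Assumed default naming scheme: x0, x1, ... in a fresh list.
            return [f"x{i}" for i in range(self.n_features_in_)]
        if len(input_features) != self.n_features_in_:
            raise ValueError("input_features has the wrong length")
        # One-to-one: return the caller's object unchanged, no copy made.
        return input_features


class PairwiseProductSketch:
    """Hypothetical transformer whose outputs are not one-to-one."""

    def fit(self, X):
        self.n_features_in_ = len(X[0])
        return self

    def get_feature_names_out(self, input_features=None):
        if input_features is None:
            input_features = [f"x{i}" for i in range(self.n_features_in_)]
        names = list(input_features)
        # One output name per unordered pair of input columns; always a list.
        return [f"{a} {b}" for i, a in enumerate(names) for b in names[i + 1:]]


X = [[1.0, 2.0, 3.0]]
names = ("age", "height", "weight")  # deliberately a tuple, not a list

passthrough_names = OneToOneSketch().fit(X).get_feature_names_out(names)
default_names = OneToOneSketch().fit(X).get_feature_names_out()
pair_names = PairwiseProductSketch().fit(X).get_feature_names_out(names)
```

In this sketch `passthrough_names` is the very same tuple that was passed in, which is the non-list case mentioned in the description, while the other two calls return new lists.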