-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
ENH add metadata routing to ColumnTransformer #27005
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH add metadata routing to ColumnTransformer #27005
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR @adrinjalali . Here are a few comments considering files other than the tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me. Only some nitpicks.
@@ -591,7 +613,9 @@ def _update_fitted_transformers(self, transformers): | |||
transformers_ = [] | |||
self._name_to_fitted_passthrough = {} | |||
|
|||
for name, old, column, _ in self._iter(): | |||
for name, old, column, _ in self._iter( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this function was complex to understand, do you mind putting a docstring (at least a summary).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have another branch where I'm doing more docstrings and some refactoring, adding the docstring there. Wanted to keep this PR somewhat small.
_transform_one. | ||
|
||
fitted : bool | ||
Used to get an iterable of transformers. If True, use the fitted |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could almost infer this parameter from the func
callable: if requesting _transform
then you expect it to be fitted=True
, if _fit
in the name then fitted=False
. I don't know if we having the keyword makes more explicit in some ways?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fair enough, remember to leave this comment in the upcoming refactoring PR please 😁
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A small suggestion, otherwise LGTM. Thanks @adrinjalali
Co-authored-by: Omar Salman <omar.salman@arbisoft.com>
fitted=False, column_as_strings=False, replace_strings=True | ||
): | ||
if step is None: | ||
continue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be safer to have a test for this uncovered line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apart of the missing line, LGTM.
@glemaitre doesn't seem like we're checking for a step being |
Fine with me if the tests are passing. Activating the auto-merge here. |
Co-authored-by: Omar Salman <omar.salman@arbisoft.com>
Co-authored-by: Omar Salman <omar.salman@arbisoft.com>
Co-authored-by: Omar Salman <omar.salman@arbisoft.com>
Towards #22893
This adds metadata routing to
ColumnTransformer
.It also adds some docstrings to private methods and does a a bit of refactoring / clean up.
Fixes #19465
Fixes #24490