-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
ENH allow extra params to be copied in clone #20681
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should it be called non_init_params
?
@@ -628,6 +628,13 @@ class NonConformantEstimatorNoParamSet(BaseEstimator): | |||
def __init__(self, you_should_set_this_=None): | |||
pass | |||
|
|||
class ConformantEstimatorWithExtraParams(BaseEstimator): | |||
def _more_tags(self): | |||
return {"extra_params": ["param1"]} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A first thought: this is hard to extend with inheritance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think @amueller wasn't very happy about a manual MRO walk
I am not sure that you can otherwise if one wants to inherit extra_params
here.
The problem spotted here would also be true for other tags that would accept an iterable, e.g. preserve_dtypes
and X_types
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In any case, that's not an issue which we'd like to solve in this PR I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I agree it's an issue for any tag with a collection for a value, but I'm not sure that those other examples would warrant the use of inheritance to extend them as much as the present case, since "what gets cloned" is likely to be the sort of thing added to by a mixin. One solution is to use a prefixing system with scalar values rather than a collection value. But I'm okay leaving this concern alone, too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now it's not clear to me if this would be used anywhere else than for '_metadata_request' in practice,
Yep, at least inside sklearn, I don't see us adding anything else anytime soon, or too frequently at all. And however it's used by external libraries, I'm sure they can find a workaround to the issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And however it's used by external libraries, I'm sure they can find a workaround to the issue.
I think we need to at least provide some guidance and what this work around looks like.
We can special case non_init_params
here:
Lines 348 to 349 in 4b0d291
more_tags = base_class._more_tags(self) | |
collected_tags.update(more_tags) |
and have it actual extend the list.
Are there any use cases we see for non_init_params
besides _metadata_request
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can special case non_init_params here:
It'll make that part of the code a bit dirty, but I won't oppose. Should it be included in this PR?
I think we need to at least provide some guidance and what this work around looks like.
Since the workaround would apply to all lists, should we have it in a different PR?
Are there any use cases we see for non_init_params besides _metadata_request?
As far as I'm concerned, metadata_request
is the only one, but I guess third party developers can get creative.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'll make that part of the code a bit dirty, but I won't oppose. Should it be included in this PR?
I am thinking if that is the workaround third parties would have to do.
I guess third party developers can get creative.
I am starting to see it as a hack to get freezing to work. Imagine listing out all the attributes you want to "freeze" in non_init_params
.
As far as I'm concerned, metadata_request is the only one
In #20350, was there any reason for a developer to know that _metadata_request
exist and need to interact with metadata_request
directly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am thinking if that is the workaround third parties would have to do.
The workaround would be for them to get the tags in _more_tags
, and make sure the extend it, rather than returning only new ones, which is not a bad hack.
I am starting to see it as a hack to get freezing to work. Imagine listing out all the attributes you want to "freeze" in non_init_params.
It'll probably involve more than that for meta-estimators, but I like the idea.
In #20350, was there any reason for a developer to know that _metadata_request exist and need to interact with metadata_request directly?
I don't think so, they pretty much would work with the helper functions and classes.
Are we happy with this now? |
I do like this being in tags, but I am starting to feel uneasy about tags after the I do not think need to rush this PR because the |
@thomasjpfan this PR came about to be since I was cleaning up the sample props PR and getting it ready for a second vote. We'll need another month once that PR is finished to vote on it, and probably some more discussion time, and that process is blocked by this PR. So I'd say we're already behind the schedule for sample props to get in 1.1 I'm also not sure if we need to wait for the extendability of the lists in estimator tags to be fixed before we merge this. That issue is orthogonal to this PR, and can be resolved after this PR (also #20681 (comment)). So I'd be happy if we don't block this PR by #20804. |
@adrinjalali Do you think we can make progress on #20350 with hardcoding I do not want to rush out this PR because of how it is using |
Resolved by #24568 |
Fixes #20585
This PR proposes a new tag:
"extra_params"
(open to better names), which would indicate which non__init__
parameters should be copied inclone
.