-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
Make BaseShuffleSplit public #20056
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make BaseShuffleSplit public #20056
Conversation
Added BaseShuffleSplit to the __init__ so it can be referenced directly for type hints without having to use both StratifiedShuffleSplit and ShuffleSplit when any ShuffleSplit would suffice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR @pabloduque0
So this is making BaseShuffleSplit
public now. BaseCrossValidator
is already public (and both are undocumented in the API ref BTW, which is OK I guess)
I don't think there's a scenario where we would be able to break BaseShuffleSplit backward compat without also having to break ShufflSplit and StratifiedShuffleSplit anyway, so making this public doesn't seem to introduce new constraints. LGTM then
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
This typing problem might happen for a variety of base classes.
For reference, such classes non-exclusively are:
BaseSearchCV
BaseSpectral
BaseForest
BaseDecisionTree
Making some of them public might come with harder constraints.
@pabloduque0: if you can share it. what is your context for type hinting BaseShuffleSplit
?
Also, have you had the need to type hint other interfaces?
@jjerphan Thanks for your response! In this case what I was trying to achieve is allow any KFold or similar in one of the parameters of a function. For other base classes, especialy when it comes to models (eg. if your function can take any SKlearn model, or any regressor or any classifier) I tend to fall back to creating a protocol for structural subtyping which has Let me know if you would like further details :) |
@pabloduque0: thanks for your message. Types aren't used in the code base (for flexibility and historical reasons I guess), but it is interesting to have some users' feedback on this. You might be interested by following #17799 and subsequent crossed-linked issues. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
What does this implement/fix? Explain your changes.
Added BaseShuffleSplit to the init so it can be referenced directly for type hints without having to use both StratifiedShuffleSplit and ShuffleSplit when any type of shuffle split would suffice.
When accepting any shuffle split one has to do:
This is I believe unnecessary since both inherit from BaseShuffleSplit and could be easily solved by adding BaseShuffleSplit to the publicly visible init (since _split is not).
Also for accepting any splitter class from model_selection module in your function the type hints would need to be:
With this change now it would be only
BaseShuffleSplit
for any shuffle split andUnion[model_selection.BaseCrossValidator, model_selection.BaseShuffleSplit]
for any splitter classAny other comments?
I have not seen anything why having this missing from the init file would be intentional but let me know if I missed something that would make this change mess something up.