Skip to content

Make BaseShuffleSplit public #20056

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jun 12, 2021
Merged

Make BaseShuffleSplit public #20056

merged 1 commit into from
Jun 12, 2021

Conversation

pabloduque0
Copy link
Contributor

What does this implement/fix? Explain your changes.

Added BaseShuffleSplit to the init so it can be referenced directly for type hints without having to use both StratifiedShuffleSplit and ShuffleSplit when any type of shuffle split would suffice.

When accepting any shuffle split one has to do:

Union[model_selection.StratifiedShuffleSplit, model_selection.ShuffleSplit]

This is I believe unnecessary since both inherit from BaseShuffleSplit and could be easily solved by adding BaseShuffleSplit to the publicly visible init (since _split is not).

Also for accepting any splitter class from model_selection module in your function the type hints would need to be:

Union[model_selection.BaseShuffleSplit, model_selection.ShuffleSplit, model_selection.StratifiedShuffleSplit]

With this change now it would be only BaseShuffleSplit for any shuffle split and Union[model_selection.BaseCrossValidator, model_selection.BaseShuffleSplit] for any splitter class

Any other comments?

I have not seen anything why having this missing from the init file would be intentional but let me know if I missed something that would make this change mess something up.

Added BaseShuffleSplit to the __init__ so it can be referenced directly for type hints without having to use both StratifiedShuffleSplit and ShuffleSplit when any ShuffleSplit would suffice.
Copy link
Member

@NicolasHug NicolasHug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR @pabloduque0

So this is making BaseShuffleSplit public now. BaseCrossValidator is already public (and both are undocumented in the API ref BTW, which is OK I guess)

I don't think there's a scenario where we would be able to break BaseShuffleSplit backward compat without also having to break ShufflSplit and StratifiedShuffleSplit anyway, so making this public doesn't seem to introduce new constraints. LGTM then

@NicolasHug NicolasHug changed the title Added BaseShuffleSplit to the __init__ so it can be referenced directly in type hints. Make BaseShuffleSplit public May 7, 2021
Copy link
Member

@jjerphan jjerphan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

This typing problem might happen for a variety of base classes.
For reference, such classes non-exclusively are:

  • BaseSearchCV
  • BaseSpectral
  • BaseForest
  • BaseDecisionTree

Making some of them public might come with harder constraints.

@pabloduque0: if you can share it. what is your context for type hinting BaseShuffleSplit?
Also, have you had the need to type hint other interfaces?

@pabloduque0
Copy link
Contributor Author

@jjerphan Thanks for your response!

In this case what I was trying to achieve is allow any KFold or similar in one of the parameters of a function.

For other base classes, especialy when it comes to models (eg. if your function can take any SKlearn model, or any regressor or any classifier) I tend to fall back to creating a protocol for structural subtyping which has fit, predict plus the methods included in BaseEstimator. This generally does the trick but I can see how it could not be optimal especially if you need to replicate this across multiple places.

Let me know if you would like further details :)

@jjerphan
Copy link
Member

jjerphan commented Jun 9, 2021

@pabloduque0: thanks for your message.

Types aren't used in the code base (for flexibility and historical reasons I guess), but it is interesting to have some users' feedback on this.

You might be interested by following #17799 and subsequent crossed-linked issues.

Copy link
Member

@rth rth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@rth rth merged commit 978c91b into scikit-learn:main Jun 12, 2021
@adrinjalali adrinjalali mentioned this pull request Sep 3, 2021
6 tasks
eddiebergman added a commit to automl/auto-sklearn that referenced this pull request Nov 15, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants