Skip to content

Determining parameter behavior based on float / int type #7973

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
amueller opened this issue Dec 3, 2016 · 11 comments
Open

Determining parameter behavior based on float / int type #7973

amueller opened this issue Dec 3, 2016 · 11 comments
Labels
Needs Decision Requires decision

Comments

@amueller
Copy link
Member

amueller commented Dec 3, 2016

There are many places in scikit-learn where ints and floats (usually between 0 and 1) are allowed, and have different semantics.
Most of us think this is brittle. This is an issue to discuss whether we should deprecate and use a different interface, and what that interface should be.

I think the main issue arises when people pass 1.0 vs 1 and 0.0 vs 0 (though they are likely to have the same semantics).
I think we can keep the current semantics (as they are very concise) as long as we are careful about 1.0 and 1.

@GaelVaroquaux proposed to use a more explicit syntax, like "10%", which is a non-backward-compatible change and needs a full deprecation of all floats 0-1. There is a question on whether this applies to all floats between 0 and 1 or only those where also an integer is possible.

My half-baked idea would be to replace 1.0 by "all", which is quite intuitive. I'm not sure what to do about 1, though. We could require something like "one", which would be a bit inconsistent, but solve all use-cases (grid-searching from 1 to 10 will be odd, though).

However, if we do the %, then grid-searching fractions becomes pretty ugly.

@amueller
Copy link
Member Author

amueller commented Dec 3, 2016

Ok so criteria for a solution I think should be

  1. How many current use-cases will be deprecated?

  2. How concise can common usage patterns be expressed?

  3. How concise can grid-search be expressed?

  4. Do we need deprecations when we allow the other type for a parameter that only allowed one type before?

Hm and clearly also whether the approach "solves" the ambiguity. Though I'm actually not sure what it means to "solve the ambiguity". My definition was "Every usage that doesn't raise an error has 'obvious' behavior".

@amueller
Copy link
Member Author

amueller commented Dec 3, 2016

Maybe another good definition of "Solving ambiguity" would be "typecasting a valid parameter never becomes a valid parameter with changed semantics"

@amueller
Copy link
Member Author

amueller commented Dec 3, 2016

Wow I didn't realize but all the options in the trees actually accept both 1 and 1.0 with different semantics. I know we discussed that before but ... hm...
We could satisfy the second definition by just deprecating 1.0 (replace by "all") and allowing 1

@jnothman
Copy link
Member

jnothman commented Dec 4, 2016 via email

@amueller
Copy link
Member Author

amueller commented Dec 5, 2016

@jnothman do you have an example?

@GaelVaroquaux
Copy link
Member

GaelVaroquaux commented Dec 5, 2016 via email

@jnothman
Copy link
Member

jnothman commented Dec 5, 2016 via email

@jnothman
Copy link
Member

jnothman commented Dec 5, 2016 via email

@amueller
Copy link
Member Author

amueller commented Dec 5, 2016

@jnothman as something that we already have or something we might want to consider in the future? Maybe the resampling has that... hum

@jnothman
Copy link
Member

jnothman commented Dec 5, 2016 via email

@adrinjalali
Copy link
Member

@amueller do you still think we're gonna fix this? :D Removing from the milestone, please add to 2.0 if you think this should be prioritized.

@adrinjalali adrinjalali removed this from the 1.0 milestone Aug 22, 2021
@cmarmo cmarmo added the Needs Decision Requires decision label Dec 15, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Needs Decision Requires decision
Projects
None yet
Development

No branches or pull requests

5 participants