-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
API Rename force_all_finite into ensure_all_finite #29404
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
API Rename force_all_finite into ensure_all_finite #29404
Conversation
dacdf67
to
4432e70
Compare
sklearn/utils/validation.py
Outdated
@@ -1292,6 +1376,14 @@ def check_X_y( | |||
f"{estimator_name} requires y to be passed, but the target y is None" | |||
) | |||
|
|||
if force_all_finite != "deprecated": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What should the behavior be when both are set?
check_array(force_all_finite=True, ensure_all_finite=False)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way I implemented it for now is that if both are set, force_all_finite
takes over.
I can see that we'd like to have an error if both are different though. Let me change that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I'm not so sure. It makes the logic quite complicated.
What about the following ?
check_array(force_all_finite=False, ensure_all_finite=True)
ensure_all_finite=True
is the default. The user already gets a warning to use ensure_all_finite instead of force_all_finite. I don't want to raise an additional error, while he just left the other to its default value. Otherwise we need a sentinel to be able to check if ensure_all_finite
is set to a non-default value, but I don't think the added complexity is worth.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead, I added a comment in warning saying that ensure_all_finite
is ignored when force_all_finite
. I know that since ensure_all_finite
is the new name it feels a bit weird and one would expect the opposite, but this way keeps things simple and avoids adding a complex logic.
What do you think ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm +0.5 on just improving the docs/warning message.
The alternative is to have ensure_all_finite=None
as the default that behaves as True
. Then if force_all_finite!="deprecated"
and ensure_all_finite is not None
, raise an error that says that are not be set at the same time?
In two versions, we update ensure_all_finite=True
and it'll be backward compatible because None
behaved the same way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The alternative is to have
ensure_all_finite=None
Yes that's what I was not enthusiastic about. I would be in favor of doing that only if we agree to remove None
as a valid option in 1.8 without deprecation 😄. Because otherwise I find that a 4 releases cycle is too long for a renaming.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is a developer tools, I think I'm fine avoiding to have the default sentinel None
and later change the default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what @glemaitre is suggesting here. But I would rather use None
, not document it, and later remove it after deprecation of force_...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed it so that now an error is raised when both are set. I added a comment to remove None
in 1.8 without deprecation.
@pytest.mark.filterwarnings( | ||
"ignore:'force_all_finite' was renamed to 'ensure_all_finite':FutureWarning" | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this coming from lightgbm
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, technically from scikit-learn as a dependency of lightgbm.
sklearn/metrics/pairwise.py
Outdated
.. versionadded:: 0.22 | ||
.. versionadded:: 0.20 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
was this wrong?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, just a copy paste mistake on my end. I fixed that.
sklearn/metrics/pairwise.py
Outdated
.. versionadded:: 0.20 | ||
Accepts the string ``'allow-nan'``. | ||
|
||
.. versionchanged:: 0.23 | ||
Accepts `pd.NA` and converts it into `np.nan` | ||
|
||
.. versionadded:: 1.6 | ||
`force_all_finite` was renamed to `ensure_all_finite`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we only need a version added I guess?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The thing, if we don't copy it here, is that in 2 versions when we remove force_all_finite, then this information will disappear. I find it weird to remove the history for a renaming of a parameter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this information is relevant anymore when we're renaming the parameter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @adrinjalali. Basically since we change the name of parameter, we cannot go back to version 0.23 because this parameter is not existing there.
We just need to make sure that the options added are mentioned in the docstring that is the case here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok I removed that
sklearn/metrics/pairwise.py
Outdated
if force_all_finite != "deprecated": | ||
warnings.warn( | ||
"'force_all_finite' was renamed to 'ensure_all_finite' in 1.6 and will be " | ||
"removed in 1.8. Until then, ensure_all_finite is ignored when " | ||
"force_all_finite is set.", | ||
FutureWarning, | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
usually in these cases we raise if both are set. Only the new one should be set. More like
if force_all_finite != "deprecated" and ensure_all_finate != "default":
raise ...
elif force_all_finite != "deprecated":
warn("use ensure...")
ensure_... = force_...
elif ensure_... == "default":
ensure_... = True
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See the discussion here #29404 (comment), and especially my last comment #29404 (comment). I agree with you, but I just want to make sure that we're on the same page regarding the removal of "default" or None (whichever we chose) in 1.8 without deprecation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
I considered to be slightly more lenient because we are touching to a developer tool. If it would be an estimator, I would go through the 4 deprecation cycles thingy.
`force_all_finite` was renamed to `ensure_all_finite` and will be removed | ||
in 1.8. | ||
|
||
ensure_all_finite : bool or 'allow-nan', default=True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm happy with the approach of using None
as the default and documenting that it's True
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Otherwise LGTM.
Co-authored-by: Adrin Jalali <adrin.jalali@gmail.com>
Co-authored-by: Adrin Jalali <adrin.jalali@gmail.com>
closes #29262
renames
force_all_finite
intoensure_all_finite
in the following public functions:check_array
check_X_y
as_float_array
check_pairwise_arrays
pairwise_distances
The
ensure_xxx
pattern already used by several parameters of these functions only means that some checks are activated or not but has no effect on the output array.force_all_finite
follows this behavior but does not follow this naming pattern. I think it should, especially since we now have a parameterforce_writeable
that has an effect on the output array.