Skip to content

API Rename force_all_finite into ensure_all_finite #29404

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 14 commits into from
Jul 25, 2024

Conversation

jeremiedbb
Copy link
Member

closes #29262

renames force_all_finite into ensure_all_finite in the following public functions:

  • check_array
  • check_X_y
  • as_float_array
  • check_pairwise_arrays
  • pairwise_distances

The ensure_xxx pattern already used by several parameters of these functions only means that some checks are activated or not but has no effect on the output array.

force_all_finite follows this behavior but does not follow this naming pattern. I think it should, especially since we now have a parameter force_writeable that has an effect on the output array.

Copy link

github-actions bot commented Jul 3, 2024

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 36d968e. Link to the linter CI: here

@jeremiedbb jeremiedbb marked this pull request as ready for review July 3, 2024 14:25
@jeremiedbb jeremiedbb force-pushed the rename-force-all-finite branch from dacdf67 to 4432e70 Compare July 3, 2024 14:34
@@ -1292,6 +1376,14 @@ def check_X_y(
f"{estimator_name} requires y to be passed, but the target y is None"
)

if force_all_finite != "deprecated":
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What should the behavior be when both are set?

check_array(force_all_finite=True, ensure_all_finite=False)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way I implemented it for now is that if both are set, force_all_finite takes over.
I can see that we'd like to have an error if both are different though. Let me change that

Copy link
Member Author

@jeremiedbb jeremiedbb Jul 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I'm not so sure. It makes the logic quite complicated.

What about the following ?
check_array(force_all_finite=False, ensure_all_finite=True)

ensure_all_finite=True is the default. The user already gets a warning to use ensure_all_finite instead of force_all_finite. I don't want to raise an additional error, while he just left the other to its default value. Otherwise we need a sentinel to be able to check if ensure_all_finite is set to a non-default value, but I don't think the added complexity is worth.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead, I added a comment in warning saying that ensure_all_finite is ignored when force_all_finite. I know that since ensure_all_finite is the new name it feels a bit weird and one would expect the opposite, but this way keeps things simple and avoids adding a complex logic.

What do you think ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm +0.5 on just improving the docs/warning message.

The alternative is to have ensure_all_finite=None as the default that behaves as True. Then if force_all_finite!="deprecated" and ensure_all_finite is not None, raise an error that says that are not be set at the same time?

In two versions, we update ensure_all_finite=True and it'll be backward compatible because None behaved the same way.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The alternative is to have ensure_all_finite=None

Yes that's what I was not enthusiastic about. I would be in favor of doing that only if we agree to remove None as a valid option in 1.8 without deprecation 😄. Because otherwise I find that a 4 releases cycle is too long for a renaming.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is a developer tools, I think I'm fine avoiding to have the default sentinel None and later change the default.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what @glemaitre is suggesting here. But I would rather use None, not document it, and later remove it after deprecation of force_...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed it so that now an error is raised when both are set. I added a comment to remove None in 1.8 without deprecation.

Comment on lines +16 to +18
@pytest.mark.filterwarnings(
"ignore:'force_all_finite' was renamed to 'ensure_all_finite':FutureWarning"
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this coming from lightgbm?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, technically from scikit-learn as a dependency of lightgbm.

Comment on lines 130 to 131
.. versionadded:: 0.22
.. versionadded:: 0.20
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

was this wrong?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, just a copy paste mistake on my end. I fixed that.

Comment on lines 150 to 157
.. versionadded:: 0.20
Accepts the string ``'allow-nan'``.

.. versionchanged:: 0.23
Accepts `pd.NA` and converts it into `np.nan`

.. versionadded:: 1.6
`force_all_finite` was renamed to `ensure_all_finite`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we only need a version added I guess?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The thing, if we don't copy it here, is that in 2 versions when we remove force_all_finite, then this information will disappear. I find it weird to remove the history for a renaming of a parameter.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this information is relevant anymore when we're renaming the parameter.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @adrinjalali. Basically since we change the name of parameter, we cannot go back to version 0.23 because this parameter is not existing there.

We just need to make sure that the options added are mentioned in the docstring that is the case here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I removed that

Comment on lines 181 to 187
if force_all_finite != "deprecated":
warnings.warn(
"'force_all_finite' was renamed to 'ensure_all_finite' in 1.6 and will be "
"removed in 1.8. Until then, ensure_all_finite is ignored when "
"force_all_finite is set.",
FutureWarning,
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

usually in these cases we raise if both are set. Only the new one should be set. More like

if force_all_finite != "deprecated" and ensure_all_finate != "default":
  raise ...
elif force_all_finite != "deprecated":
  warn("use ensure...")
  ensure_... = force_...
elif ensure_... == "default":
  ensure_... = True

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the discussion here #29404 (comment), and especially my last comment #29404 (comment). I agree with you, but I just want to make sure that we're on the same page regarding the removal of "default" or None (whichever we chose) in 1.8 without deprecation.

@glemaitre glemaitre self-requested a review July 25, 2024 08:20
Copy link
Member

@glemaitre glemaitre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

I considered to be slightly more lenient because we are touching to a developer tool. If it would be an estimator, I would go through the 4 deprecation cycles thingy.

`force_all_finite` was renamed to `ensure_all_finite` and will be removed
in 1.8.

ensure_all_finite : bool or 'allow-nan', default=True
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy with the approach of using None as the default and documenting that it's True.

Copy link
Member

@adrinjalali adrinjalali left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise LGTM.

Co-authored-by: Adrin Jalali <adrin.jalali@gmail.com>
@adrinjalali adrinjalali merged commit 1c47967 into scikit-learn:main Jul 25, 2024
30 checks passed
MarcBresson pushed a commit to MarcBresson/scikit-learn that referenced this pull request Sep 2, 2024
Co-authored-by: Adrin Jalali <adrin.jalali@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

API rename force_all_finite into ensure_all_finite in check_array ?
4 participants