-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
DEP Adding a warning in the SimpleImputer when strategy mode is constant and keep_empty_features is False #29950
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DEP Adding a warning in the SimpleImputer when strategy mode is constant and keep_empty_features is False #29950
Conversation
…ant and keep_empty_features is False
I'll start to make a review with a couple of changes to do. |
The current docstring of
|
So right now, the CI is failing because we raise the warning in some tests:
So we should look at each case, and check if it makes sense and catch the warning / change the test or if this is a false positive and a bug in the code. I'll review the code and propose the changes. |
So luckily, for the tests However, could you add the following on the top of each test:
For the remaining test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK here is a round of review.
Please add an entry to the change log at doc/whats_new/v1.6.rst
. Like the other entries there, please reference this pull request with :pr:
and credit yourself (and other contributors if applicable) with :user:
.
This will resolve the remaining CI failure related to checking the changelog.
Thank you, I will start working on all those reviews. |
…onstant, changing some tests, removing a useless test and updating the changelog
…le/scikit-learn into simple_imputer_add_warning
I've made the requested modifications based on your reviews. Sorry that I didn't have much time to focus on it earlier. Everything should be up to date now, and please let me know if there's anything else I need to adjust. I'm waiting for the CI results. |
No issue. Apparently the CI is not happy. I'll have a look at what is going on. |
The remaining failure is in another estimator called So here, you would need to catch the warning. The diff will look like: diff --git a/sklearn/impute/_iterative.py b/sklearn/impute/_iterative.py
index d2d47bae3f..a90a659980 100644
--- a/sklearn/impute/_iterative.py
+++ b/sklearn/impute/_iterative.py
@@ -635,6 +635,13 @@ class IterativeImputer(_BaseImputer):
X_missing_mask = _get_mask(X, self.missing_values)
mask_missing_values = X_missing_mask.copy()
+
+ # TODO (1.8): remove this once the deprecation is removed. In the meantime,
+ # we need to catch the warning to avoid false positives.
+ catch_warning = (
+ self.initial_strategy == "constant" and not self.keep_empty_features
+ )
+
if self.initial_imputer_ is None:
self.initial_imputer_ = SimpleImputer(
missing_values=self.missing_values,
@@ -642,9 +649,19 @@ class IterativeImputer(_BaseImputer):
fill_value=self.fill_value,
keep_empty_features=self.keep_empty_features,
).set_output(transform="default")
- X_filled = self.initial_imputer_.fit_transform(X)
+ if catch_warning:
+ with warnings.catch_warnings():
+ warnings.simplefilter("ignore", FutureWarning)
+ X_filled = self.initial_imputer_.fit_transform(X)
+ else:
+ X_filled = self.initial_imputer_.fit_transform(X)
else:
- X_filled = self.initial_imputer_.transform(X)
+ if catch_warning:
+ with warnings.catch_warnings():
+ warnings.simplefilter("ignore", FutureWarning)
+ X_filled = self.initial_imputer_.transform(X)
+ else:
+ X_filled = self.initial_imputer_.transform(X)
if in_fit:
self._is_empty_feature = np.all(mask_missing_values, axis=0)
@@ -658,7 +675,7 @@ class IterativeImputer(_BaseImputer):
# The constant strategy has a specific behavior and preserve empty
# features even with ``keep_empty_features=False``. We need to drop
# the column for consistency.
- # TODO: remove this `if` branch once the following issue is addressed:
+ # TODO (1.8): remove this `if` branch once the following issue is addressed:
# https://github.com/scikit-learn/scikit-learn/issues/29827
X_filled = X_filled[:, ~self._is_empty_feature]
Applying it will make the test pass. |
sklearn/impute/_base.py
Outdated
.. versionchanged:: 1.6 | ||
Currently, when `keep_empty_feature=False` and `strategy="constant"`, | ||
empty features are not dropped. This behaviour will change in version | ||
1.8. Set `keep_empty_feature=True` to preserve this behaviour. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should solve the documentation failure, sphinx
is quite stubborn about those :).
.. versionchanged:: 1.6 | |
Currently, when `keep_empty_feature=False` and `strategy="constant"`, | |
empty features are not dropped. This behaviour will change in version | |
1.8. Set `keep_empty_feature=True` to preserve this behaviour. | |
.. versionchanged:: 1.6 | |
Currently, when `keep_empty_feature=False` and `strategy="constant"`, | |
empty features are not dropped. This behaviour will change in version | |
1.8. Set `keep_empty_feature=True` to preserve this behaviour. |
Thank you ! I have corrected what you told me, however I saw when fetching the main upstream that many lines have been deleted in v.1.6.rst file. I think it is the reason why the check changelog ci is failing. Do I need to modify another file for the changelog ? |
Indeed, we change the way we create changelog entry to have a more automated way. Now, we need to create a single file in the module folder. I push a commit to create the file in the last commit. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks @ArthurCourselle
We will need a second reviewer.
Maybe @adrinjalali can have a look such that we get this one in to not delay the deprecation.
Ok thank you ! I'm looking forward for the other review then. I will look for other issues that I could work on in the future. Thanks @glemaitre for your help :) |
Nice. Thanks @ArthurCourselle for you contribution ;) |
Reference Issues/PRs
Fixes #29827
What does this implement/fix? Explain your changes.
Adding a warning in the SimpleImputer to prevent the user about an upcoming deprecation. This shows when the user set the keep_empty_features to False and the strategy mode is constant and the user have a column full of np.nan because it is not removed.