-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
TST test_bagging_regressor/classifier_with_missing_inputs fails with SimpleImputer #11482
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Uhm what is the reasoning to let infinity pass through the regression? |
So the argument is that previously inf was a valid value for |
Not sure but it seems that's because |
@jnothman apparently we don't allow infinite values in check_array while we could fill-up missing_values with inf previously. For me filling with |
FYI #9707 introduces the test. |
Yep but #9707 wanted to actually check that the BaggingRegressor let pass through NaN and infinite. I don't see why we should have a Imputer there. |
instead of an imputer we could have some other kind of non-finite-accepting
dummy regressor there.
I'm happy rejecting missing_values=inf but it may deserve documentation
|
@jeremiedbb What is your opinion on letting on the side |
@qinhanmin2014 You might want to update your PR as proposed by @jnothman. |
Well I'd say why not allow it, it does not require huge changes in the imputers, but I don't know if this is common to encode the missing values as np.inf. |
The PR is for someone who want to look at the complete log. I would prefer the author or one of the reviewer to submit a PR here since I'll need some time to go through the discussions in #11211. |
I get not wanting to impose restrictions, but this is different to, say,
forcing someone's data have no NaNs in a meta-estimator. Sometimes it's
okay/good to be dogmatic unless shown a need to be lenient. And inf in a
feature space *should* ring alarm bells.
|
I could not figure out any use case in which you would like to user specifically
I would agree on not letting it pass. |
if we really want to support imputation that makes no other conditions on
the values in the data, we could consider np.ma.masked_array. but I don't
think we need that at this stage, and pandas has chosen not to do missing
values like that.
|
So is there a PR/consensus on what to do? I don't think we need to be able to impute np.inf. |
See #11480
Git blame shows that we've introduced regression in #11211
ping the author @jeremiedbb and the reviewers @jnothman @glemaitre @ogrisel @jorisvandenbossche
Below are the logs from test_bagging_regressor_with_missing_inputs:
The text was updated successfully, but these errors were encountered: