Closed
Description
As of #29034, `_weighted_percentile` handles NaNs by ignoring them when calculating the percentile. `np.median` and `np.percentile`, on the other hand, return NaN if a NaN is present in the input (`np.nanmedian` and `np.nanpercentile` ignore NaNs).
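For reference, the NumPy side of this difference can be reproduced with a small array. This is only an illustration of the public NumPy functions named above; the array `y` and the variable names are made up for the example.

```python
import numpy as np

# Toy input with a single NaN (illustrative data, not from the codebase).
y = np.array([1.0, 2.0, np.nan, 4.0])

# The plain functions propagate NaN to the result.
median_propagates = np.median(y)
pct_propagates = np.percentile(y, 50)

# The nan-aware variants drop NaNs before computing the statistic.
median_ignores = np.nanmedian(y)
pct_ignores = np.nanpercentile(y, 50)

print(median_propagates, pct_propagates)  # both NaN
print(median_ignores, pct_ignores)        # both computed on [1.0, 2.0, 4.0]
```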
There are many cases in the codebase where, if `sample_weight` is `None`, a `np` function is used (NaN returned), whereas if `sample_weight` is given, `_weighted_percentile` is used and NaNs are ignored.
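A minimal sketch of how this branching leads to inconsistent results. Both `weighted_median_ignoring_nan` and `fit_constant` are hypothetical names written for this illustration; the former is a simplified stand-in that drops NaNs, not the actual `_weighted_percentile` implementation.

```python
import numpy as np


def weighted_median_ignoring_nan(y, sample_weight):
    """Hypothetical stand-in: weighted median that drops NaNs first."""
    y = np.asarray(y, dtype=float)
    w = np.asarray(sample_weight, dtype=float)
    keep = ~np.isnan(y)
    y, w = y[keep], w[keep]
    order = np.argsort(y)
    y, w = y[order], w[order]
    cum = np.cumsum(w)
    # First value whose cumulative weight reaches half of the total weight.
    idx = np.searchsorted(cum, 0.5 * cum[-1])
    return y[idx]


def fit_constant(y, sample_weight=None):
    """Hypothetical estimator-like helper mirroring the branching pattern."""
    if sample_weight is None:
        # Unweighted path: NumPy propagates NaN.
        return np.median(y)
    # Weighted path: NaNs are silently ignored.
    return weighted_median_ignoring_nan(y, sample_weight)


y = np.array([1.0, 2.0, np.nan, 4.0])
unweighted = fit_constant(y)
weighted = fit_constant(y, sample_weight=np.ones_like(y))
print(unweighted, weighted)
```

With uniform weights one would expect both calls to agree, yet in this sketch the unweighted call returns NaN while the weighted call returns a finite value.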
Summary of affected cases:
- `DummyRegressor.fit`
- `AbsoluteError`/`PinballLoss`/`HuberLoss` - `fit_intercept_only` method
- `median_absolute_error`
- `d2_pinball_score`
- `SplineTransformer._get_base_knot_positions` - I think this was the original reason for "MNT `_weighted_percentile` supports np.nan values" (#29034)
Maybe we could assess on a case-by-case basis whether it makes sense to return NaN if one is present in the input? @ogrisel suggested that we may want to raise a warning in some cases as well.